Abstract

A microscopic model is developed, within the framework of the theory of quantitative traits,
to study the combined effect of competition and assortativity on the sympatric speciation
process, i.e. speciation in the absence of geographical barriers. Two components of fitness
are considered: a static one that describes adaptation to environmental factors not related to
the population itself, and a dynamic one that accounts for interactions between organisms,
e.g. competition. A simulated annealing technique is applied in order to speed up the simulations.
The simulations show that, for both flat and steep static fitness landscapes,
competition and assortativity exert a synergistic effect on speciation. We also show that
competition acts as a stabilizing force against extinction due to random sampling in a finite
population. Finally, evidence is shown that speciation can be seen as a phase transition.

1 The problem. 

The notion of speciation in biology refers to the splitting of an original species into two fertile, yet 
reproductively isolated strains. The allopatric theory, which is currently accepted by the majority 
of biologists, claims that a geographic barrier is needed in order to break the gene flow so as to 
allow two strains to evolve complete reproductive isolation. On the other hand, much evidence
and many experimental data have been reported in recent years strongly suggesting the possibility of a
sympatric mechanism of speciation. For example, the comparison of mitochondrial DNA sequences
of cytochrome b performed by Schliewen and others showed the monophyletic origin of cichlid
species living in some volcanic lakes of western Africa. The main features of these lakes are their
environmental homogeneity and the absence of microgeographical barriers. It is thus possible that
the present diversity is the result of several events of sympatric speciation. An increasing number
of studies referring both to animal and plant species lend further support to this hypothesis.

The key element for sympatric speciation is assortative mating, that is, mating must be
allowed only between individuals whose phenotypic distance does not exceed a given threshold. In
fact, consider a population characterized by a bimodal distribution for an ecological character
determining adaptation to the environment: in a regime of random mating the crossings between
individuals of the two humps will produce intermediate phenotypes so that the distribution will
never split. Two interesting theories have been developed to explain the evolution of assortativity.
In Kondrashov and Kondrashov's theory, disruptive selection (for instance determined by a
bimodal resource distribution) splits the population into two distinct ecological types that are later
stabilized by the evolution of assortative mating. The theory of evolutionary branching developed
by Doebeli and Dieckmann is more general in that it does not require disruptive selection:
the population first converges in phenotype space to an attracting fitness minimum (as a result of
common ecological interactions such as competition, predation and mutualism) and then it splits
into diverging phenotypic clusters. For example, given a Gaussian resource distribution, the
population first crowds on the phenotype with the highest fitness, and then, owing to the high
level of competition, splits into two distinct groups that later become reproductively isolated due
to selection of assortative mating.

In the present paper we will not investigate the evolution of assortativity, which will be treated as a
tunable parameter in order to study its interplay with competition. In particular we will show that:
(1) assortativity alone is sufficient to induce speciation, but one of the new species soon disappears
due to random fluctuations; (2) stable species coexistence can be attained through the introduction
of competition; (3) competition and assortativity exert a synergistic effect on speciation, so that
high levels of assortativity can trigger speciation even in the presence of weak competition and
vice versa; (4) speciation can be thought of as a phase transition, as can be deduced from the plot
of variance versus competition and assortativity; (5) contrary to the traditional interpretation of
Fisher's theorem, the mean fitness of the population does not always increase but reaches a
constant value (sometimes even after a decrease) as a result of the deterioration of environmental
conditions, a result consistent with Price and Ewens's reformulation [23] of Fisher's
theorem. The use of a simulated annealing method enables us to find stationary or quasi-stationary
distributions in reasonably short simulation times.

In Section 2 we describe our model and briefly outline its implementation, providing some 
computational details; in Section 3 we report the results of the simulations distinguishing between 
the case of flat (subsection 3.1) and steep (subsection 3.2) static fitness landscapes, while in
subsection 3.3 evidence is given of the robustness of our algorithm with respect to variations of
the genome length; in Section 4 we show that speciation can be regarded as a phase transition;
finally, in Section 5 we draw the conclusions of our study.

2 The model. 

In order to develop a microscopic evolution model, we first have to establish how to represent
an individual, with the requirement of obtaining the simplest (and computationally affordable)
model that still captures the essentials of the phenomena under study.

There are many possible description levels: from the single base to domains inside a gene to
whole allelic forms. Since the mutation patterns are quite complex at lower levels, we have chosen
to codify the allelic forms of a gene as a discrete variable $g_i$ whose value is zero for the wild type and
then increases according to the biological efficiencies of the resulting protein. At this level there
are two main ingredients: the number of efficiency levels that are observable in a real population
and the degeneracy of each level.

As a starting point, we study here the simplest choice, i.e. a population of haploid individuals
whose genome is represented by a string $(g_1, g_2, \ldots, g_L)$ of $L$ binary digits (bits). Each bit represents
a locus and the Boolean values it can take are regarded as alternative allelic forms. In particular
$g_i = 0$ refers to the wild-type allele while $g_i = 1$ to the least deleterious mutant. The phenotype
$x$, in agreement with the theory of quantitative traits, is just the sum of these bits, $x = \sum_{i=1}^{L} g_i$.
According to this theory, in fact, quantitative characters are determined by many genes whose
effects are small, similar and additive.

Even if we have studied only the simple case of two alternative allelic forms, by analogy with 
the statistical mechanics of discrete (magnetic) systems, we do not expect qualitative differences 
when a larger number of levels or moderate degeneracy is taken into account. The presence of 
epistatic interactions among genes can indeed induce different behaviors, but in this case we are 
abandoning the theory of quantitative traits. On the other hand, a large degeneracy of non-wild- 
type levels with respect to the first one can induce an error threshold-like transition but this 
transition needs also a relatively large difference in the phenotypic trait corresponding to different 
indices, a situation which in our opinion is not the typical one. 

A time step is composed of three subprocesses: selection, recombination and mutation. Mutations
are simply implemented by flipping a randomly chosen element of the genome from 0 to 1 or
vice versa. This kind of mutation can only turn a phenotype $x$ into one of its neighbors $x + 1$ or
$x - 1$, and such mutations are therefore referred to as short-range mutations. The fact that both mutations
$0 \to 1$ and $1 \to 0$ occur with the same probability is a coarse-grained approximation, because
mutations affecting a wild-type allele (0 allele) usually impair its function, but mutations on
already damaged genes (1 allele) are not very likely to restore their activity. One should therefore
expect the frequency of the $1 \to 0$ mutations to be significantly lower than that of the $0 \to 1$
mutations. The choice of equal frequencies for both kinds of mutations, on the other hand, can be
justified by assuming that mutations are mostly due to duplications of genes or to transposable
elements that go in and out of target sites in DNA with equal frequencies. Another limitation
of our model of mutations is that the frequency of mutation is independent of the locus. The
frequency of mutation of a long gene, for example, should be higher than that of a short gene, and
the frequency of mutation should also depend on the packing of chromatin. The inaccuracies
in our model of mutation, however, do not impair the results of the algorithm, because, as Bagnoli
and Bezzi showed [16], the occupation of fitness maxima mainly depends on selection, while
mutations only create genetic variability. Moreover, the role of mutations in the present model is
even smaller, as genes are continuously rearranged through recombination.
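
The genome representation and the short-range mutation described above can be sketched as follows. This is only an illustrative sketch: the function names, the genome length of 14 loci and the random seed are assumptions made for the example, not details taken from the model's actual code.

```python
import random

def phenotype(genome):
    """Phenotype x: the sum of the bits of the genome (additive trait)."""
    return sum(genome)

def mutate(genome, mu, rng=random):
    """With probability mu, flip one randomly chosen locus (0 <-> 1).

    Both flip directions are equally likely, as in the text.
    """
    child = list(genome)
    if rng.random() < mu:
        i = rng.randrange(len(child))
        child[i] = 1 - child[i]
    return child

rng = random.Random(0)                         # fixed seed, illustrative only
g = [rng.randint(0, 1) for _ in range(14)]     # L = 14 loci (hypothetical)
m = mutate(g, mu=1.0, rng=rng)                 # mu = 1 forces exactly one flip
print(phenotype(g), phenotype(m))
```

Because a single bit is flipped, the phenotype of the mutant differs from that of the original genome by exactly one unit, which is the short-range property mentioned in the text.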

Rigorously, in finite populations one cannot talk of true phase transitions, and also the concept 
of invariant distribution is questionable. On the other hand, the presence of random mutations 
should make the system ergodic in the long time limit. Unless otherwise noted, we have checked 
that the results we obtained do not depend on the size of populations, which was typically varied 
from 10^ to 10^ individuals. 

Reasonable values of the mutation rate would require very long simulations in order to reach
independence from the initial state, especially for finite populations. In order to speed up simulations,
we adopted two different strategies. The first is to use a simulated annealing technique: the
mutation rate $\mu$ depends on time according to Eq. (1), which roughly corresponds to keeping
$\mu = \mu_0$ (a high value, say $10/N$) up to a time $\tau - \delta$, then
decreasing it to the desired value $\mu_\infty$ in a time interval $2\delta$ and continuing with this value for the
rest of the simulation.

The limiting case of this procedure is to use $\tau = \delta = 1$ and $\mu_0 = 1$, which is equivalent to
starting from a random genetic distribution. Simulations show that all initial conditions tend to
the same asymptotic one, and that the variance reaches its asymptotic value very quickly.
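
A schedule with the qualitative behaviour described above can be sketched as follows. The tanh-shaped crossover is an assumption chosen because it reproduces the description (a plateau near the initial rate, a decrease over an interval of about twice the width parameter, then a plateau at the final rate); it is not claimed to be the exact functional form of Eq. (1). The values of `mu0` and `mu_inf` are hypothetical, while `tau` and `delta` follow the Figure 2 caption.

```python
import math

def mutation_rate(t, mu0, mu_inf, tau, delta):
    """Annealed mutation rate: close to mu0 for t << tau, close to mu_inf
    for t >> tau, with a smooth crossover of half-width delta around tau.

    The tanh shape is an assumption made for this sketch.
    """
    return mu_inf + 0.5 * (mu0 - mu_inf) * (1.0 - math.tanh((t - tau) / delta))

mu0, mu_inf = 1e-1, 1e-6        # hypothetical values
tau, delta = 10000, 3000        # as in the Figure 2 caption
for t in (0, tau - delta, tau, tau + delta, 30000):
    print(t, mutation_rate(t, mu0, mu_inf, tau, delta))
```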

The assortativity is introduced through the mating range $\Delta$, which represents the maximal
phenotypic distance still compatible with reproduction. The reproduction phase is performed
as follows. We choose one parent at random in the population, while the other parent is chosen
among those whose phenotypic distance from the first parent does not exceed $\Delta$. The genome of the
offspring is built by choosing for each locus the allele of the first or second parent with the same
probability, and then mutations are introduced by inverting the value of one bit with probability
$\mu$. In our model we therefore assume absence of linkage, which is a simplification often used in the
literature. It must be remembered, however, that this simplification is only reasonable in the
case of very long genomes distributed on many independent chromosomes. The effects of linkage
disequilibrium will be considered in a future work.
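
The reproduction phase can be sketched as below. Two points are assumptions of this sketch rather than statements of the text: the mating range is treated as inclusive (distance at most `delta`, so that `delta = 0` still allows mating within the same phenotype), and the first parent itself is not excluded from the pool of candidate partners. Function names are illustrative.

```python
import random

def choose_parents(population, delta, rng=random):
    """First parent uniformly at random; second parent uniformly among the
    individuals whose phenotype lies within distance delta of the first."""
    first = rng.choice(population)
    x = sum(first)
    candidates = [g for g in population if abs(sum(g) - x) <= delta]
    return first, rng.choice(candidates)

def recombine(a, b, rng=random):
    """Uniform recombination: each locus is copied from either parent
    with probability 1/2 (no linkage)."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]

rng = random.Random(1)                              # illustrative seed
population = [[0] * 8] * 5 + [[1] * 8] * 5          # two extreme phenotypes
p1, p2 = choose_parents(population, delta=0, rng=rng)
child = recombine(p1, p2, rng=rng)
print(sum(p1), sum(p2), sum(child))
```

With `delta = 0` the two parents always share the same phenotype, so in this toy population the offspring is identical to its parents and no intermediate phenotype is produced.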

In this work we are interested in studying the combined effects of competition and assortativity
on speciation. The simplest and computationally most efficient choice is that of a frequency-dependent
but density-independent fitness function. This decouples the evolution of the population size
from that of the frequency distribution of phenotypes, as can be shown in the mean-field
approximation [13]. It is thus possible to study a fixed-size population.

In general, the number of individuals carrying phenotype $x$ at time $t$ is denoted by $n(x,t)$, the
total population size by $N(t) = \sum_{x=0}^{L} n(x,t)$ (fixed in our simulations) and the distribution of
phenotypes by $p(x,t) = n(x,t)/N$. The evolution equation for the distribution $p(x,t)$ is:

$$\tilde{p}(x,t) = \sum_{y,z} p(y,t)\,p(z,t)\,W(x|y,z) \tag{2}$$

$$p(x,t+1) = \frac{A(x,t)}{\bar{A}(t)}\,\tilde{p}(x,t) \tag{3}$$

where $\tilde{p}(x,t)$ is the frequency of phenotype $x$ after the recombination and mutation steps, $W(x|y,z)$
is the conditional probability that phenotype $x$ is produced by parents with phenotypes $y$ and $z$,
$A(x,t)$ is the survival factor of phenotype $x$ at time $t$ and $\bar{A}(t) = \sum_x A(x,t)\,\tilde{p}(x,t)$. The survival
factor is thus the unnormalized probability of surviving to the reproductive age. The idea behind
this approach is quite simple: individuals with a survival factor higher than average have the best
chances to survive.
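
At the level of the phenotype distribution, the selection step simply reweights each frequency by its survival factor relative to the population average. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def selection_step(p_tilde, survival):
    """Reweight the post-reproduction frequencies by A(x) / A_bar, where
    A_bar is the average survival factor over the distribution."""
    a_bar = sum(a * p for a, p in zip(survival, p_tilde))
    return [a * p / a_bar for a, p in zip(survival, p_tilde)]

p_next = selection_step([0.5, 0.5], [1.0, 3.0])   # illustrative numbers
print(p_next, sum(p_next))
```

Dividing by the average survival factor keeps the distribution normalized, which is why the population size can be held fixed.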

In general, the survival factor $A(x,t)$ depends on time either because of environmental effects
(say, daily oscillations, not considered here) or because the chances of surviving depend on the
competition with the whole population, i.e.

$$A(x,t) = A(x, p(t)). \tag{4}$$

In the presence of competition, a non-overlapping generation model is much faster to simulate, 
since in this case we do not need to recompute the survival factor for each individual, but only 
once per phenotype per generation. 

Since $A$ is proportional to the survival probability, and the probability of independent events
is the product of the probabilities of the single events, we define:

$$A(x, p(t)) = \exp\left(H(x, p(t))\right) \tag{5}$$

and we call $H$ the fitness landscape, or simply fitness. In this way the events contribute additively
to $H$. This choice is a common one (see for instance Refs. [17, 18, 19]), but some may prefer
$A$ as the real fitness. Notice that $A$ is defined up to a multiplicative constant factor, and so $H$
is defined up to an additive constant (thus it may always be made positive, consistently with the
usual biological literature).

The fitness $H$ can be built heuristically. First of all there is the viability $H_0(x)$ of phenotype $x$,
which does not depend on the interactions with other individuals. $H_0(x)$ can therefore be defined as the static
component of fitness, describing the adaptation to abiotic factors such as climate or temperature
whose dynamics is much slower than that of a biological population (consider for instance the
alternation of ice ages and warmer periods in the geological history of the Earth). The next terms of
the fitness function account for the pair interactions, the three-body interactions, etc. The general
expression of the fitness landscape is thus:

$$H(x,p) = H_0(x) + \sum_{y} H_1(x|y)\,p(y) + \sum_{y,z} H_2(x|y,z)\,p(y)\,p(z) + \ldots \tag{6}$$

In the present work we consider only the effect of the first two terms. The static component of
the fitness is defined as:

$$H_0(x) = \exp\left(-\frac{1}{\beta}\left(\frac{x}{\Gamma}\right)^{\beta}\right)$$

We choose this function because it can reproduce several landscapes found in the literature by
tuning the parameters $\beta$ and $\Gamma$. In particular $H_0(x)$ becomes flatter and flatter as the parameter
$\beta$ is increased. When $\beta \to 0$ we obtain the sharp peak landscape at $x = 0$; when $\beta = 1$ the
function is a declining exponential whose steepness depends on the parameter $\Gamma$; and finally when
$\beta \to \infty$ the fitness landscape is constant in the range $[0, \Gamma]$ and zero outside (step landscape).
Some examples of the effects of $\Gamma$ on the static fitness profile are shown in Figure 1.
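
The three limiting landscapes can be checked numerically with a short sketch. It assumes the static fitness has the stretched-exponential form exp(-(x/Gamma)^beta / beta), normalized so that its value at x = 0 is 1; the parameter values below are illustrative.

```python
import math

def static_fitness(x, beta, gamma):
    """Static fitness exp(-(1/beta) * (x/gamma)**beta); equals 1 at x = 0."""
    return math.exp(-((x / gamma) ** beta) / beta)

# beta -> 0: sharp peak at x = 0 (fitness drops to ~0 already at x = 1)
print(static_fitness(0, 0.01, 14), static_fitness(1, 0.01, 14))
# beta = 1: declining exponential exp(-x / gamma)
print(static_fitness(3, 1, 2), math.exp(-1.5))
# beta large: step landscape, ~1 inside [0, gamma] and ~0 outside
print(static_fitness(7, 100, 14), static_fitness(20, 100, 14))
```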

Figure 1: Steep profiles of static fitness $H_0(x)$. From top to bottom $\Gamma = 1, 2, 3$ and $\beta = 1$.

The dynamic part of the fitness has a similar expression, with parameters $\alpha$ and $R$ that control
the steepness and range of competition among phenotypes.
The complete expression of the fitness landscape is:

$$H(x,t) = H_0(x) - J \sum_{y} \exp\left(-\frac{1}{\alpha}\left|\frac{x-y}{R}\right|^{\alpha}\right) p(y,t) \tag{7}$$

The parameter $J$ controls the intensity of competition with respect to the arbitrary reference value
$H_0(0) = 1$. If $\alpha \to 0$ an individual with phenotype $x$ is in competition only with other organisms
with the same phenotype; conversely, in the case $\alpha \to \infty$ a phenotype $x$ is in competition with all
the other phenotypes in the range $[x - R, x + R]$, and the boundaries of this competition interval
become blurry when $\alpha$ is decreased.
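
A sketch of the complete fitness and of the corresponding survival factor is given below. It assumes a competition kernel of the form exp(-|(x - y)/R|^alpha / alpha) weighted by the phenotype frequencies, and a survival factor equal to the exponential of the fitness. The distribution `p`, the parameter values and the function names are illustrative assumptions.

```python
import math

def fitness(x, p, J, alpha, R, h0):
    """Static fitness minus J times the competition felt by phenotype x.

    p[y] is the frequency of phenotype y; h0 is the static fitness function.
    """
    competition = sum(
        math.exp(-(abs(x - y) / R) ** alpha / alpha) * py
        for y, py in enumerate(p)
    )
    return h0(x) - J * competition

def survival_factor(x, p, J, alpha, R, h0):
    """Survival factor A = exp(H)."""
    return math.exp(fitness(x, p, J, alpha, R, h0))

flat = lambda x: 1.0                    # flat static landscape
p = [0.0] * 15                          # phenotypes 0..14
p[0], p[14] = 0.3, 0.7                  # two well-separated peaks
h = fitness(0, p, J=2.0, alpha=2.0, R=2.0, h0=flat)
print(h, survival_factor(0, p, 2.0, 2.0, 2.0, flat))
```

When the two peaks are far apart compared with the competition range, each phenotype effectively competes only with itself, so the fitness at x = 0 is very close to 1 - J * p(0).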

Let us introduce some properties of the selection phase (we do not consider here the effects of
recombination in the reproductive phase) by means of some simple examples. Let us start with the
case of a population in which all genotypes differ only in neutral genes (i.e. genes that do not
affect $H$). In this way, the population remains phenotypically homogeneous also after recombination. In
this case the survival factor $A(x, p(t))$ is constant and equal to $\bar{A}$, and there is no selection at the
distribution level. Indeed, the total number of individuals sharing the phenotype $x$ may be affected
by competition, and the actual number of offspring may be reduced so that the whole population
is led to extinction (which is not considered in our model, since we work at fixed population size). But
selection is not able to alter the phenotypic/genotypic distribution, since all phenotypes experience
the same level of selection.

We now consider the case of a population composed of two phenotypes only, $x_1$ and $x_2$,
with respective frequencies $p(x_1) = q$ and $p(x_2) = 1 - q$. Suppose that the two peaks are
far enough apart in the phenotypic space that there is only intraspecific competition (already
considered by the normalization of the probability distribution) and no interspecific competition;
also assume that the static fitness landscape is flat so that $H_0(x) = 1$ for both phenotypes.
Under these conditions $A(x_1, p(t)) = \exp(1 - Jq)$, $A(x_2, p(t)) = \exp(1 - J(1 - q))$, and $\bar{A} =
A(x_1, p(t))\,p(x_1) + A(x_2, p(t))\,p(x_2) = q \exp(1 - Jq) + (1 - q) \exp(1 - J(1 - q))$. The evolution
equation (3) has only one free component

$$p'(x_1) = q' = \frac{A(x_1, p(t))}{\bar{A}}\,p(x_1) = \frac{q \exp(-Jq)}{q \exp(-Jq) + (1 - q) \exp(-J(1 - q))} \tag{8}$$

which exhibits fixed points for $q = 0$, $q = 1$ and $q = q^* = 0.5$. For $J > 0$ the only stable point
is $q = q^*$, corresponding to the minimum competition felt by each strain. Notice that, differently
from what intuition suggests, the factor $F(q) = A(x_1, p(t))/\bar{A}$ is not a monotonically decreasing
function of $q$, but exhibits a minimum at $q \approx 0.8$. Indeed, without mutations, $q = 1$ (only one
species) has to be a fixed point (no new species can be generated), so $F(1) = 1$, and in order to
have a stable fixed point at $q = q^* = 0.5$ ($F(0.5) = 1$), one needs $F(q) < 1$ for $0.5 < q < 1$.
Similarly, one needs $F(q) > 1$ for $0 < q < 0.5$.
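
The behaviour of this two-phenotype map can be explored numerically. The sketch below iterates the selection step for the frequency q of the first phenotype and scans the relative survival factor F(q) on a grid; the value J = 1, the starting point q = 0.9 and the grid resolution are arbitrary choices made for illustration.

```python
import math

def next_q(q, J):
    """One selection step for the frequency q of phenotype x1 (flat static
    landscape, intraspecific competition only)."""
    w1 = q * math.exp(-J * q)
    w2 = (1.0 - q) * math.exp(-J * (1.0 - q))
    return w1 / (w1 + w2)

def relative_survival(q, J):
    """F(q) = A(x1) / A_bar, so that the updated frequency is q * F(q)."""
    a1 = math.exp(-J * q)
    a2 = math.exp(-J * (1.0 - q))
    return a1 / (q * a1 + (1.0 - q) * a2)

J = 1.0                      # hypothetical competition intensity
q = 0.9                      # start far from the symmetric state
for _ in range(200):
    q = next_q(q, J)
print(q)

# Locate the minimum of F on a grid over (0.5, 1)
grid = [i / 1000 for i in range(501, 1000)]
q_min = min(grid, key=lambda v: relative_survival(v, J))
print(q_min, relative_survival(q_min, J))
```

For this choice of J the iteration settles on the symmetric state, and F has an interior minimum between 0.5 and 1, in line with the non-monotonic behaviour noted in the text. The location of the minimum depends on J.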

We now briefly review the implementation of our model. The initial population is chosen at
random and stored in a phenotype distribution matrix with $L + 1$ rows and $N$ columns. Each
row represents one of the possible phenotypes; as the whole population might crowd on a single
phenotype, $N$ memory locations must be allocated for each phenotype. Each generation begins
with the reproduction step. The first parent is chosen at random; in a similar way, the second
parent is randomly chosen within the mating range of the first one, i.e. within the range
$[\max\{0, x - \Delta\}, \min\{L, x + \Delta\}]$, where $x$ is the phenotype of the first parent.

The offspring is produced through uniform recombination, i.e., for each locus it will receive
the allele of the first or second parent with equal probability; the recombinant then undergoes
mutation on a random allele with probability $\mu$. The newborn individuals are stored in a copy of
the phenotype distribution matrix with $L + 1$ rows and $N$ columns. At this stage we compute the
survival factor $A(x, p(t))$ for each phenotype and the average $\bar{A}$. The reproduction procedure is
followed by the selection step. As we consider a constant-size population, a cycle is iterated until
$N$ individuals are copied back from the second to the first matrix. In each iteration of the cycle,
an individual is chosen at random and its relative fitness is compared to a random number $r$ with
uniform distribution between 0 and 1: if $r < A(x, p(t))/\bar{A}(t)$ the individual survives and is passed
on to the next generation, otherwise a new attempt is made.
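
The selection cycle can be sketched as a simple rejection loop. For brevity the sketch works on a flat list of offspring phenotypes rather than on the phenotype distribution matrix described above, and it samples candidates with replacement; both choices, as well as the names used, are assumptions of this illustration.

```python
import random

def select(offspring, survival, n, rng=random):
    """Fill the next generation with n survivors.

    offspring: list of phenotypes of the newborn individuals
    survival:  survival[x] is the survival factor of phenotype x
    A randomly drawn individual is accepted if r < A(x) / A_bar,
    otherwise a new attempt is made.
    """
    a_bar = sum(survival[x] for x in offspring) / len(offspring)
    survivors = []
    while len(survivors) < n:
        x = rng.choice(offspring)
        if rng.random() < survival[x] / a_bar:
            survivors.append(x)
    return survivors

rng = random.Random(2)                      # illustrative seed
offspring = [0] * 50 + [1] * 50             # two phenotypes, equal numbers
survivors = select(offspring, survival=[1.0, 0.0], n=100, rng=rng)
print(len(survivors), set(survivors))
```

In this extreme example phenotype 1 has zero survival factor, so it is never accepted and the next generation consists only of phenotype 0, while the population size stays at n.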

It should be noted that both reproduction and selection steps are affected by stochastic com- 
ponents, i.e. the reproduction and survival possibility of an individual does not depend only on 
its fitness but also on accidental and unpredictable circumstances, which is quite realistic. Con- 
sider for instance, an individual colonizing a new territory: its fitness will be very high due to the 
availability of resources and lack of competition, but, as the region is still scarcely populated, it 
may be difficult to find a partner and it may not reproduce at all. Similar remarks apply to an 
individual with high fitness that is accidentally killed by a landslip or by the flood of a river. 

Another interesting research problem that we address in this paper is that of the direction of 
evolution. It is common belief that evolution realizes a continuing advance towards more sophisti- 
cated forms. Recent advances in evolutionary theory (such as the theory of punctuated equilibrium) 
and observations of evolutionary phenomena, however, seem to indicate that evolution is a largely 
unpredictable, chaotic and contingent series of events, where small fluctuations may lead to major 
catastrophes that change the future course of development. Fisher's fundamental theorem of natural
selection [20, 21, 22] and Holland's schemata theorem seem to identify an evolutionary
equivalent of free energy in the average fitness of the population.

However, according to the modern interpretation of Fisher's theorem by Price and Ewens [23],
the total change in mean fitness is the sum of a partial change related to the variation in genotypic
frequencies (operated by natural selection) and a partial change related to environmental
deterioration, and Fisher's theorem predicts that only the first contribution is always non-negative. The
schemata theorem, on the other hand, states that schemata with higher than average fitness tend to
receive an exponentially increasing number of samples in successive generations. As a consequence,
the average fitness of the population is expected to increase as time goes by. Holland's theorem,
however, is not general in that it is based on the assumption of fixed fitness for each phenotype.
In realistic situations, fitness is frequency dependent, so that as the population becomes richer
and richer in fitter individuals the intensity of competition also increases, therefore counteracting
a further increase in the frequency of the best adapted phenotypes. This increase in competition
intensity is exactly what is meant by deterioration of the environmental conditions in Price and
Ewens's reformulation of Fisher's theorem [23]. In our formalism, the deterioration is explicitly
introduced by the competition term.

3 Results. 

We have performed a large series of simulations in order to check the role of the various parameters. 
Since the detailed discussion of the parameter effects is rather long, it is reported in the Appendix
and we summarize here the main results. The numbering of the list corresponds to the sections of
the Appendix.

A.1. Flat static fitness landscape:

  A.1.1. Effects of assortativity:

    A.1.1.1. Random mating: in the absence of assortativity speciation is not possible even if
             competition is very intense. All phenotypes are present although the distribution
             shows peaks and valleys.

    A.1.1.2. Moderate assortativity: the distribution becomes sharp; all phenotypes but one
             disappear even in the absence of competition.

    A.1.1.3. Strong assortativity (and absence of competition): coexistence of multiple species
             is possible, but this is not a stable distribution due to random fluctuations.

    A.1.1.4. Maximal assortativity: similar to the previous case, but with transients showing many
             coexistent phenotypes.

  A.1.2. Effects of competition:

    A.1.2.1. Weak competition in the presence of strong assortativity: stable coexistence.

    A.1.2.2. Deterioration of the environment and Fisher's theorem: effective fitness decreases
             during simulations due to competition.

    A.1.2.3. Role of competition range: as the range of competition increases, the number of
             coexisting species decreases.

    A.1.2.4. High competition level and short range: increasing the competition intensity may
             increase the number of coexisting species.

    A.1.2.5. Interplay between mating range and competition: an increase in competition may
             counteract an increase in the mating range.

A.2. Steep static fitness landscape:

  A.2.1. Absence of competition: only one species.

  A.2.2. Stabilizing effects of competition: a moderate competition level may induce coexistence.

  A.2.3. Competition-induced speciation: higher competition levels correspond to more species.

  A.2.4. Interplay between mating range and competition: an increase in competition may
         counteract an increase in the mating range also in the case of steep fitness.

A.3. Influence of genome length: by repeating the simulations after doubling the genome length
     we show that, after rescaling the competition and assortativity range with the genome length,
     the final population distribution also scales linearly.

From these simulations it emerges that the relevant parameters for speciation (for a given
static fitness landscape) are the competition intensity and the assortativity (mating range). In the
presence of competition, mutations play almost no role.

4 Phase diagrams. 

One of the purposes of the present work was to study the evolutionary dynamics in the widest
possible range of competition and assortativity so as to bring to light a possible synergistic effect on
speciation. To be consistent with the intuitive idea that higher assortativity corresponds to shorter
mating ranges, we define the assortativity as $\mathcal{A} = L - \Delta$. As this kind of research requires a huge
number of simulations, the problem arises of finding an easily computable mathematical parameter
suitable for monitoring the speciation process.

In this respect, one of the first candidates is the variance which, as is common knowledge,
represents a measure of the dispersion of a distribution:

$$\mathrm{var} = \sum_{i} (x_i - \bar{x})^2\, p(x_i)$$

The simulations, in fact, show that, as competition and/or assortativity is increased, the frequency
distribution first widens, then becomes bimodal and eventually splits into two sharp
peaks that move in the phenotypic space so as to maximize their reciprocal distance: each of these
steps involves an increase in the variance of the frequency distribution.
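
As a small illustration of why the variance tracks this sequence of events, the sketch below computes it for a single delta peak and for two delta peaks at the opposite ends of a phenotypic space with 15 phenotypes (0 to 14). The distributions are made up for the example and the function name is illustrative.

```python
def variance(p):
    """Variance of a phenotype distribution; p[x] is the frequency of x."""
    mean = sum(x * px for x, px in enumerate(p))
    return sum((x - mean) ** 2 * px for x, px in enumerate(p))

single_peak = [0.0] * 15
single_peak[7] = 1.0                 # one delta peak in the middle

two_peaks = [0.0] * 15
two_peaks[0] = two_peaks[14] = 0.5   # two delta peaks at the extremes

print(variance(single_peak), variance(two_peaks))
```

A single species concentrated on one phenotype has zero variance, while two equally populated peaks at maximal distance give the largest variance attainable in this space.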

It may be argued that the choice of variance as a parameter to monitor speciation may lead
to mistakes of interpretation. As an example, in a regime of random mating and high competition
intensity, a trimodal distribution spanning the whole phenotypic space arises (when the static
fitness landscape is flat): such a distribution, shown in Figure 01, features a very high variance value.
On the other hand, we also showed that in the absence of competition ($J = 0$), a speciation event
may be induced by very high assortativity ($\Delta = 0$, $\mathcal{A} = L$), but one of the newborn species soon
disappears due to random fluctuations: the final distribution is therefore represented by a single
delta peak whose variance is obviously zero. It is therefore true that the variance does not allow us to
detect transient speciation events, but in the present work we are mainly concerned with
the stable coexistence of the new species. In the interpretation of the variance values of the two
examples we are discussing, therefore, we will not conclude that the former is closer to speciation
than the latter, but we will deduce that a trimodal distribution with three not completely split
humps is closer to species coexistence than a single delta peak, even if the latter represents the surviving
species of a recent speciation event.

Before describing the plots of variance as a function of competition and assortativity, it is
necessary to underscore that for each pair of parameters $J$, $\mathcal{A}$ of the phase diagrams, the variance
was averaged over 10 independent runs, so as to account for the stochastic factors influencing the
evolutionary patterns, while the plots shown in the preceding sections represent the output of a
single run of our program.

As we will discuss in more detail, when variance is plotted as a function of both competition 
and assortativity, the surface shows a sharp transition from a very low value to a plateau at a very 
high variance level when assortativity and competition become larger than a certain threshold. 
Analysis of the distribution shows that the sharp transition from the low to the high variance level 
is indicative of the shift from a state with a single quasi-species to a state with two or more distinct 
quasi-species. The variance plot can be therefore considered as a speciation phase diagram. 

Let us start by examining the variance plot as a function of competition and assortativity in the
relatively simple case of a flat static fitness.

A graphical way to illustrate the synergistic effect is to study the contour plots of the
three-dimensional speciation phase diagram (Figure 2).

The contour plots divide the $J$, $\mathcal{A}$ plane into three regions: the area on the left corresponds to the
state with a single quasi-species, the area near the upper-right corner corresponds to the state with
two or more distinct quasi-species. The area on the right below the previous one corresponds to
wide distributions, with peaks which are not completely isolated. The phase transition is located
where the contour lines wiggle, at $\mathcal{A} \approx 11$ ($\Delta \approx 3$).

If a point, owing to a change in competition and/or assortativity, crosses these borderlines
moving from the first to the second region, a speciation event occurs. It should be noted
that in the case of flat static fitness, and, to a smaller extent, also in the case of steep static
fitness, in the high competition region the contour plots tend to diverge from each other, showing
a gradual increase of the variance of the frequency distributions. This is due to the fact that, even
if competition alone is not sufficient to induce speciation in recombinant populations, it spreads
the frequency distribution, which becomes wider and wider and splits into two distinct species
only for extremely high assortativity values. In this regime of high competition only the ends of
the mating range of a phenotype $x$ are populated, and the crossings between these comparatively
different individuals will create once again the intermediate phenotypes, preventing speciation until
assortativity becomes almost maximal.

Figure 2: Contour plots var = 5, 10, 15, 20, 25 for flat static fitness landscape. Parameters: $\beta = 100$,
$\Gamma = 14$, $\alpha = 2$, $R = 2$, $p = 0.5$. Annealing parameters: $\mu_0 = 10^{-\ldots}$, $\mu_\infty = 10^{-\ldots}$, $\tau = 10000$,
$\delta = 3000$. Total evolution time: 30000 generations. Each point of the plot is the average of 10
independent runs.

It can be noted that in the absence of competition, however much assortativity is increased, the
variance of the final distribution will always be zero. In a random mating regime ($\mathcal{A} = 0$), in
fact, the initial distribution remains bell-shaped until $t = \tau$ and then, due to recombination, it
shrinks into a delta peak occupying a random position in the phenotypic space: speciation does not
occur. Conversely, with high assortativity ($\mathcal{A} = 13$, $\mathcal{A} = 14$) the initial distribution immediately
turns into two delta peaks at $x = 0$ and $x = 14$ linked by a continuum of very low frequencies on
the intermediate phenotypes. After $t = \tau$ two scenarios are possible: one of the delta peaks at
$x = 0$ and $x = 14$ disappears and the other one survives, or, alternatively, both the extreme peaks
disappear and a new peak arises in the central region of the space: a transient speciation event
has occurred.

With a weak competition level such as J = 2, even in the case of random mating, the distribution
always remains bell-shaped and never turns into a delta peak: a weak level of competition
is therefore necessary to stabilize a bell-shaped distribution. The situation does not change until
A = 12, where the distribution splits into two peaks, each one covering two phenotypes. When
assortativity is further increased to A = 13 and A = 14, three and five stable delta peaks appear
respectively, evenly spaced in the phenotypic space so as to relieve competition. These speciation
patterns explain the abrupt increase in variance for high assortativity.

When J = 6 the higher competition intensity changes the patterns: for A = 0 the distribution is
always bell-shaped, but with intermediate assortativity such as A = 10 and A = 11 the distribution
becomes bimodal and remains so throughout the simulation. When A = 12 the final distribution
is composed of three peaks, not two as in the case 0 < J < 6: two delta peaks appear at
the opposite ends of the space and one peak spanning two phenotypes arises in the central region.
The higher competition intensity also affects the case A = 13: two delta peaks appear at the
extreme ends of the phenotypic space, and two peaks, each one covering two phenotypes, appear in
the central region (recall that only three delta peaks appeared for J = 4). Finally, for A = 14 we
have the usual pattern with five evenly spaced delta peaks.

These patterns basically do not change for higher competition values, the differences being
quantitative rather than qualitative. For instance, when J = 10 the distribution is bell-shaped for
all assortativity values from A = 0 to A = 6, but it is wider, and thus the variance is
larger, than in the cases with weaker competition. For A in the range from 7 to 11 the distribution
becomes bimodal: the stronger competition therefore allows the distribution to become bimodal
at much lower assortativity levels than those required with J = 6. Finally, when A = 12 three
peaks appear, but all of them, and not only the intermediate one, cover several phenotypes. No
significant difference from the cases J < 10 can be detected when A = 13, 14.

The speciation phase diagram has also been studied in the case of a steep static fitness landscape.
The two diagrams are qualitatively similar, except that now the multiple-species (large
variance) phase extends to A ≃ 8 (Δ ≃ 6). On the other hand, the intermediate region of wide
distributions shrinks.

As in the previous case, we analyze some significant contour plots (Figure 3). The downward-sloping
shape of these lines is, again, a strong indication of a synergistic interaction of competition and
assortativity in the speciation process. The contour plots show that for moderate competition
there is a synergistic effect between competition and assortativity, since a simultaneous increase of
J and A may allow the borderline to be crossed whereas the increase of one parameter at a time
does not. On the other hand, for larger values of J the phase diagram shows a reentrant character
due to the extinction of one of the species, which cannot move farther apart from the other and
therefore cannot relieve competition anymore. It can also be noticed that for J = 0 the contour
plot shows a change in slope due to the extinction of one species owing to random fluctuations, as
shown earlier.

The following differences with respect to the case of flat static fitness can be detected: the
curvature of the borderlines between the coexistence phases is higher, which indicates a stronger
synergy between A and J; the absence of speciation for moderate J here is not due to finite size
effects. The contour plots of the phase diagram in the case of steep static fitness are shown in
Figure 3.

In order to make the interpretation of the phase diagram easier, we now briefly summarize the 
most significant evolutionary patterns for various values of J and A. 

In a regime of random mating and absence of competition (J = 0, A = 0) the initial frequency
distribution quickly turns into an asymmetrical bell-shaped curve in the neighborhood of x = 0,
which then becomes a delta peak at x = 0 after T = τ. The situation is basically the same for any
value of assortativity, the only partial exception being that for maximal
assortativity (A = 14) the initial distribution immediately turns into a delta peak at x = 0 followed
by a tail of low-frequency mutants. These mutants, however, disappear after T = τ, so that the
stationary distribution is still a single delta peak at x = 0 and the variance equals
zero.

We now describe what happens for J = 2, so as to explain the abrupt increase in variance
shown by the plot at high values of assortativity. For A < 11 the initial distribution turns into a
symmetrical bell-shaped distribution and then into a delta peak at x = 0, so that the variance will
be equal to zero. When A = 11, however, the initial distribution splits into two bell-shaped curves
at the opposite ends of the phenotypic space that, after T = τ, turn into two delta peaks at x = 0
and x = 10. This speciation event determines a sharp rise in variance from var = 0 to var = 22.
A second jump in variance can be observed with maximal assortativity, A = 14: in this case
the initial distribution splits into two delta peaks at x = 0 and x = 14 linked by a continuum of
low-frequency mutants; after T = τ the mutants disappear and three delta peaks appear at the
opposite ends and in the middle of the phenotypic space. This is why the variance jumps to about
30.

Figure 3: Contour plots var = 5, 10, 15, 20, 25, 30, 35 for steep static fitness landscape. Parameters:
β = 1, Γ = 10, α = 2, R = 4, p = 0.5. Annealing parameters: μ0 = 10^(…), μ∞ = 10^(…), τ = 10000,
δ = 3000. Total evolution time: 30000 generations. Each point of the plot is the average of 10
independent runs.
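As a rough check of the variance values quoted above, the variance of a distribution made of a
few delta peaks can be computed directly. The following Python lines are a minimal sketch: the
peak weights are hypothetical values chosen for illustration and are not frequencies taken from
the simulations.

```python
def distribution_variance(peaks):
    """Variance of a phenotype distribution given as {phenotype: frequency}."""
    total = sum(peaks.values())
    mean = sum(x * f for x, f in peaks.items()) / total
    return sum(f * (x - mean) ** 2 for x, f in peaks.items()) / total

# Single delta peak: no spread at all.
one_peak = distribution_variance({0: 1.0})

# Two delta peaks at x = 0 and x = 10 with assumed (hypothetical) weights.
two_peaks = distribution_variance({0: 0.67, 10: 0.33})

# Three equally weighted peaks at the ends and in the middle of the space.
three_peaks = distribution_variance({0: 1 / 3, 7: 1 / 3, 14: 1 / 3})

print(one_peak, two_peaks, three_peaks)
```

With these assumed weights the two-peak case gives a variance of about 22 and the three-peak
case about 33; a heavier central peak would lower the latter value.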

When J = 4, in the assortativity range 0-10 the initial distribution moves towards x = 0 and
then becomes a delta peak at this position. When A = 11 the distribution splits into two bell-shaped
curves, one of which becomes a delta peak at x = 0 while the other becomes a peak covering two
phenotypes, usually x = 11 and x = 12. It should be noted that for J = 2 the second species
was represented by a delta peak too; the higher competition experienced for J = 4, however, can
be more easily relieved by spreading the species over several phenotypes. The necessity to relieve the
competition pressure also explains why for A = 13 not two (as with J = 2) but three delta peaks
appear in the final distribution. When A = 14, however, the final number of delta peaks is five as
with J = 2, probably because a higher number of peaks would make interspecific competition
unsustainable with the competition range R = 4.

When J = 6 the pattern is basically the same as with J = 4, apart from a couple of differences.
When A = 8 the distribution, instead of becoming a delta peak at x = 0, splits into two bell-shaped
distributions that after T = τ shrink and move towards the opposite ends of the space, so that the
variance can reach a final value of about 24. Another important difference is that the
appearance of a delta peak at x = 0 and a peak covering x = 11, 12, 13 now occurs
with A = 9 and not A = 11 as with J = 4, which is a typical example of the synergistic
interplay between competition and assortativity.

When J = 8, in the range A = 0 to A = 5 the initial distribution does not become a delta peak
at x = 0 but remains a bell-shaped (even if asymmetrical) curve with a variance around 4.5.
As a consequence, the variance plot now increases more gradually and not in a stepwise manner
as observed so far. When A is in the range 6-8, however, the variance jumps to 20-25 because the
distribution splits into two bell-shaped curves, which with J = 6 could happen only with A = 8. When
A = 9, contrary to what was observed with J = 6, no delta peak can appear at x = 0; the final
distribution comprises a first peak covering phenotypes x = 0, 1, 2, 3 and a second peak spanning
phenotypes x = 9, 10, 11, 12, 13 so as to relieve the competition pressure. Finally, for A > 11
three delta peaks appear at the ends and in the middle of the phenotypic space: this pattern for
0 < J < 8 could be observed only with maximal assortativity A = 14.

For J = 10 the effects of this extremely high competition are most evident at low assortativity.
When A = 0 the initial distribution becomes wide and flat, covering the phenotypic range 0-12,
and after T = τ it becomes slightly bimodal. As a consequence, the variance is very high, reaching
var = 10. This pattern becomes more and more extreme as assortativity increases until, for A = 4,
the distribution spans the whole phenotypic range 0-14 and becomes more markedly bimodal,
with maximal frequencies on x = 0, 1, 2 and x = 12, 13. The variance increases accordingly up to
var = 20. For A > 4 the pattern is the same as for J = 8.

5 Conclusions. 

A microscopic model has been developed for the study of sympatric speciation i.e. the origin of 
two reproductively isolated strains from a single original species in the absence of any geographical 
barrier. 

In all our simulations we employed a simulated annealing technique, starting the run with a
very high mutation rate that is then decreased according to a sigmoidal function, which allows
the stationary distribution to be attained in a reasonably short runtime.
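A minimal sketch of such an annealing schedule is given below. The tanh form, the roles assigned
to tau (midpoint of the decrease) and delta (width of the decrease), and the default values of
mu0 and mu_inf are all assumptions made for illustration; only the values tau = 10000 and
delta = 3000 are taken from the figure captions.

```python
import math

def mutation_rate(t, mu0=1e-1, mu_inf=1e-6, tau=10000, delta=3000):
    """Sigmoidal decrease of the mutation rate from mu0 to mu_inf.

    Assumed form: a tanh step centred at t = tau with width delta.
    mu0 and mu_inf are hypothetical illustration values.
    """
    step = 0.5 * (1.0 - math.tanh((t - tau) / delta))
    return mu_inf + (mu0 - mu_inf) * step

# Mutation rate at a few generations of a 30000-generation run.
for t in (0, 5000, 10000, 15000, 30000):
    print(t, mutation_rate(t))
```

With this form the rate stays close to mu0 well before tau, passes through the midpoint between
mu0 and mu_inf at t = tau, and is close to mu_inf by the end of the run.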

We showed that in a flat static fitness landscape, assortativity alone is sufficient to induce
speciation even in the absence of competition. This speciation event, however, is only transient,
and soon one of the two new species goes extinct due to random fluctuations in a finite-size
population. A stable coexistence between the new species, however, could be achieved by introducing
competition. In fact, intraspecific competition turned out to stabilize the two groups by operating
a sort of negative feedback on population size.

The simulations also showed that the assortativity level necessary for speciation could be re- 
duced as competition is increased and vice versa (except for the regime of extremely high compe- 
tition), which strongly argues for a synergistic effect between the two parameters. 

Similar patterns could be observed with a steep static fitness landscape. A high assortativity 
level is sufficient to induce speciation, but in the long run only the peak with maximal fitness 
survives. The coexistence of the two species again is stabilized by competition. 

Special attention was devoted to finite size effects. We showed that by imposing maximal
assortativity, in the presence of moderate competition, it was possible to reduce the dispersion of
offspring and thus stabilize genotypically homogeneous peaks. In particular, in our 15-phenotype
fitness space, it was possible to stabilize the coexistence of three species, whose peaks tended to
become symmetrical as the steepness of the fitness landscape was reduced.

We also showed that speciation has the character of a phase transition, as the variance versus 
assortativity and competition surface shows a sharp transition from a low variance region corre- 
sponding to one species, to a high-variance region corresponding to two species. The curvature 
of the phase boundary once again supports the idea of a synergistic effect of competition and 
assortativity in inducing speciation. 

It is quite interesting to observe the behavior of the average of the fitness landscape H with
respect to time in the various simulations. In general there is a large variation in correspondence
with the variation of the mutation rate. One can observe that there are cases in which H increases
smoothly when decreasing μ, while others, corresponding to weak competition and moderate
assortativity, exhibit a sudden decrease, often coupled to further oscillations. At the microscopic
level, this decrease corresponds to the extinction of small inter-populations that lowered the
competition levels.

We observe here a synergistic effect among mutation levels, finiteness of the population and
assortativity. In an infinite asexual population, assuming that the mutation level is sufficient to
populate each phenotype, every variation of the population that would increase the survival
probability is always kept. Thus, an increase in the mutation level would lower H, since the offspring
distribution is more dispersed than the optimum. So, the maximum of H is reached
for μ → 0, with the condition that there are still sufficient mutations to populate all strains.

When population is finite this is no longer true. First of all there are stochastic oscillations, and 
also the possibility of extinction of isolated strains, which are hardly repopulated by the vanishing 
mutations. Competition, by favoring dispersion, relieves this effect. 

We observed that the decrease in mean fitness that can sometimes be seen after the decrease of
the mutation rate is strongly affected by assortativity in a non-linear way. This lowering of fitness
is due to the increase in intraspecific competition caused by the extinction of the low-frequency
intermediate phenotypes. Maximal assortativity, in fact, prevents the dispersion of offspring, so that
several peaks can appear in the middle of the phenotypic space, relieving intraspecific competition.
Slightly lower assortativity values are too small to stabilize new peaks in the central regions of the
phenotypic space and too large to stabilize a wide distribution, so that they lead to a significant
decrease in mean fitness. A further decrease of the assortativity, on the other hand, favors a wide
bell-shaped distribution, so that the decrease in mean fitness will be very modest.

We have also checked that this scenario remains the same if all length quantities are rescaled
(linearly) with the genome size. As a comparison, this finding does not hold when a non-recombinant
population is in competition with a recombinant one, consistent with the fact that the genome
of non-recombinant organisms is much shorter than that of recombinant ones.

These patterns were observed treating the mutation rate as a tunable parameter that changes 
during the simulation according to a sigmoid function. It would be interesting to study the behavior 
of a recombinant population when the mutation rate is an evolutionary character. This will be the 
topic of a future work. 
A Simulations.

In order to check the dependence on the initial distribution, the frequency of phenotypes in the
initial generation is binomially distributed:

$$p(x) = \binom{L}{x} p^x (1 - p)^{L - x}$$

In the binomial distribution both the mean and the variance depend only on L (the genome
length) and p (the probability of an allele being equal to 1):

$$\bar{x} = Lp$$

$$\mathrm{var} = Lp(1 - p)$$
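The binomial initial condition and its two moments can be checked numerically. The sketch below
assumes only what is stated above (genome length L = 14 and the five values of p used in the
runs); the function names are hypothetical.

```python
from math import comb

def initial_distribution(L, p):
    """Binomial frequencies of phenotypes x = 0..L for allele probability p."""
    return [comb(L, x) * p**x * (1 - p) ** (L - x) for x in range(L + 1)]

def mean_and_variance(freqs):
    """Mean and variance of a frequency vector indexed by phenotype."""
    mean = sum(x * f for x, f in enumerate(freqs))
    var = sum(f * (x - mean) ** 2 for x, f in enumerate(freqs))
    return mean, var

L = 14
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    mean, var = mean_and_variance(initial_distribution(L, p))
    print(p, mean, var)  # compare with L * p and L * p * (1 - p)
```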

For each parameter set we systematically performed simulations from five different initial
distributions defined by p = 0, p = 0.25, p = 0.5, p = 0.75, p = 1. The situations p = 0 and p = 1
refer to the limit cases in which the population is completely concentrated on phenotypes x = 0
and x = L respectively, while in the p = 0.5 case the initial distribution is centered in the middle
of the phenotypic space. Unless otherwise specified, the results of the simulations are independent
of the initial distribution. This is valid also when different runs generate quantitatively different
final distributions.

Another problem we addressed is that of finding the best approximation of the stationary
distribution of the population. Our model employs a finite population and is affected by several
stochastic elements, e.g. in the choice of individuals in the reproduction and selection steps, so that
there will always be small fluctuations from generation to generation. The presented results are
obtained by averaging over a time interval long enough to smooth out these fluctuations but short
enough not to hide possible drift effects.

A.1 Flat static fitness landscape.

One of the simplest situations we can conceive is a flat static fitness profile in the presence or
absence of competition.

A.1.1 Effects of assortativity.

A.1.1.1 Random mating. Under these conditions, in a regime of random mating, a population
is unable to speciate and, even with extremely high competition levels, the final state is a
trimodal frequency distribution. The situation is shown in Figure 4 (left panel). Very early during
the simulation, when the mutation rate is very high, two humps appear at the opposite ends of
the phenotypic space so as to minimize the mutual competition, while the central hump is fed
by the offspring of crossings between the other two humps. The continuous regeneration of the
intermediate phenotypes as a result of random mating prevents the distribution from splitting
into distinct species. When T > τ the mutation rate becomes very low and transient peaks appear
at x = 0 and x = L = 14, because these positions are very favorable in that individuals with these
phenotypes experience a very low competition level. These peaks, however, are very short-lived:
they appear and disappear very soon because of the dispersion of the offspring due to the random
mating regime. In Figure 4 we show the trimodal distribution that appears for T < τ, and the
distribution with a transient peak at x = 14 appearing after T = τ.

When p = 0.5 the initial distribution is centered at x = L/2 = 7 and it covers several phenotypes
in the central region of the phenotypic space. The high mutation rate at the beginning of the
simulation causes the distribution to become trimodal and to extend over the whole phenotypic space.
This leads to an abrupt increase in the variance of the frequency distribution, while the mean always
oscillates around the same constant value x = L/2 = 7 because the deformation of the distribution
is symmetric. The average fitness (Figure 5, left panel) also undergoes a significant increase
because the population is now distributed over more phenotypes, so that the number of individuals
per phenotype is small and the competition level lowers accordingly. As already mentioned, for
T > τ transient peaks at x = 0 and x = L do appear, so that the mean of the distribution
now shows wider oscillations while the variance further increases, because the formation of peaks
at x = 0 and x = L implies an increase in the frequency of phenotypes far from the mean of the
distribution. The average fitness, on the other hand, also increases as a result of the increase of
phenotypes x = 0 and x = L, which experience very low competition. Figure 5 shows the plots of
average fitness, variance and mean.

Figure 4: A polymodal frequency distribution generated in a random mating regime (Δ = 14) with
an extremely high competition intensity. Parameters: J = 16, α = 2, R = 7. Left panel: trimodal
distribution (averaged over generations 5000-5010); right panel: appearance of a transient peak at
x = 14 (averaged over generations 30000-30010). Except for the oscillations in the right tail, all
other peaks and valleys are stable and localized.




Figure 5: Average fitness, variance and mean in a regime of random mating (Δ = 14) and strong
competition (J = 16, α = 2, R = 7); time axis in thousands of generations. Annealing parameters:
μ0 = 10^(…), μ∞ = 10^(…), τ = 10000, δ = 3000. Total evolution time: 30000 generations.



A.1.1.2 Moderate assortativity. The scenario changes completely if we impose a regime of
assortative mating. In this case, even in the absence of competition, very interesting dynamical
behaviors ensue.

As an illustration, let us consider the case L = 14. If we set a moderate value of assortativity
Δ = 4, the frequency distribution progressively narrows until it becomes a sharp peak at
one of the intermediate phenotypes. This behavior can be easily explained. Regardless of
the value of the parameter p, due to the very high initial mutation rate, the population rapidly
becomes distributed following a very wide and flat bell-shaped frequency distribution whose average
shows small oscillations in the phenotypic space.

The situation, however, changes significantly after T = τ, when the mutation rate becomes
very low. As an experimental observation, the final frequency distribution is a delta peak typically
located in one of the central phenotypes ^, the variety of positions being the result of the stochastic
factors affecting the reproduction and selection steps.

Figure 6 shows the plots of mean and variance for this simulation. The final delta peaks are also
genotypically homogeneous (otherwise they would generate broader distributions in the next time
step). Indeed, since the static fitness is flat, any genotypically homogeneous population is stable
except for random sampling effects.

After the population has become genetically homogeneous, the only fluctuations are due to
mutations, which sporadically generate mutants. However, this effect cannot be seen in the figure,
due to the scale of the axis and the population size. The time required for a genotypic shift
(fixation of a gene in the population) is so long that this phenomenon could not be observed. In
the presence of competition, gene fixation is extremely difficult due to the competition of the
newborn mutants with the rest of the population.

The fact that this final delta-peak distribution is actually observed in simulations is due to the 
short mating range that tends to reproductively isolate species. 

In the absence of competition the average fitness plot is not particularly interesting: the
total fitness, in fact, reduces to its static component, which is equal to 1 for all phenotypes, so that
the average fitness plot will also be a constant. The variance plot, conversely, shows a decrease
for T > τ which corresponds to the shrinking of the distribution from a bell-shaped curve to a
delta peak.

As also shown in the phase diagram of Figure 2, there is a qualitative change of variance for
3 < Δ < 4, for many values of J, indicating speciation. However, in the absence of competition,
this speciation effect is only transient, and the final distribution is a delta peak.

The mean plot, on the other hand, oscillates around a constant value with oscillations that 
become wider and more irregular during the transition from the bell-shaped to the delta-peak 
distribution. 

^Typically in the range 6 to 9 and less often also at x = 4, x = 5 and x = 10.

Figure 6: Variance and mean in a regime of weak assortativity (Δ = 4) and absence of competition
(J = 0). Annealing parameters: μ0 = 10^(…), μ∞ = 10^(…), τ = 10000, δ = 3000. Total evolution
time: 30000 generations.



A.1.1.3 Strong assortativity. The results of a typical simulation with a higher level of
assortativity, Δ = 1, are shown in Figure 7. At the beginning of the run, instead of a wide bell-shaped
distribution (as in the case Δ = 4), we find a distribution covering the whole phenotypic space
with very low frequencies on the intermediate phenotypes and high frequencies on the extreme
phenotypes in the regions around x = 0 and x = 14. This corresponds to an abrupt increase in
variance, while the mean oscillates around the value x = 7.

This particular distribution appears because the short mating range enforces matings only
among very similar phenotypes. However, while matings among intermediate phenotypes with
almost equal numbers of bits 1 and 0 produce offspring different from the parents, distributed
over several phenotypes (thus lowering the central part of the distribution), the matings
between extreme phenotypes with a prevalence of zeros or ones do not scatter the offspring, which
remain similar to the parents. When the mutation rate decreases for T > τ, the scarcely populated
intermediate phenotypes are removed by random fluctuations and only two delta peaks survive at
the opposite ends of the phenotypic space (which is marked by another, more modest increase in
the variance around T = τ = 10000): a speciation event has taken place. The coexistence of the
two species, however, is not stable and later on one of the two peaks disappears, again due to random
fluctuations. As a result, the variance decreases abruptly to zero while the mean becomes constant.
The final delta peak representing the stationary distribution is the survivor of the two peaks that
had appeared at the opposite ends of the phenotypic space and, for the same recombination effect
discussed above, this asymmetric configuration ^ persists forever.
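The different scattering of offspring for intermediate and extreme parents can be made concrete
with a small calculation. The sketch below assumes uniform recombination (each locus copied from
either parent with probability 1/2, independently) and no mutation; this recombination scheme
and the example genomes are assumptions made for illustration, while the phenotype is taken to
be the number of bits equal to 1.

```python
from math import comb

def offspring_phenotypes(parent_a, parent_b):
    """Exact phenotype distribution of one offspring of two bit-string parents.

    Assumes uniform recombination: each locus is copied from either parent
    with probability 1/2, independently, and no mutation occurs.
    The phenotype is the number of bits equal to 1.
    """
    shared_ones = sum(a & b for a, b in zip(parent_a, parent_b))
    differing = sum(a ^ b for a, b in zip(parent_a, parent_b))
    return {shared_ones + k: comb(differing, k) / 2**differing
            for k in range(differing + 1)}

# Two intermediate parents, both with phenotype 7, that differ at every locus.
mid_a = [1, 0] * 7
mid_b = [0, 1] * 7
# Two extreme parents with phenotypes 1 and 0.
low_a = [1] + [0] * 13
low_b = [0] * 14

print(offspring_phenotypes(mid_a, mid_b))  # spread over phenotypes 0..14
print(offspring_phenotypes(low_a, low_b))  # only phenotypes 0 and 1
```

Only the loci at which the parents differ contribute to the spread, which is why parents near the
ends of the phenotypic space, which can differ at very few loci, produce offspring similar to
themselves.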



^The positions of the final delta peaks are clustered in the intervals 2-4 and 11-12.

Figure 7: Variance and mean in a regime of strong assortativity (Δ = 1) and absence of competition
(J = 0); time axis in thousands of generations. Annealing parameters: μ0 = 10^(…), μ∞ = 10^(…),
τ = 10000, δ = 3000. Total evolution time: 30000 generations. The modest decrease in the mean
that can be seen near the end of the simulation corresponds to a shift of the delta peak from
x = 3 to x = 2.

A.1.1.4 Maximal assortativity. We finally explore the situation of maximal assortativity
(Δ = 0). For T < τ the situation is the same as with Δ = 1: a distribution covering all the
phenotypes, with low intermediate and high extreme values. For T > τ, however, a sort of "bubbling"
activity can be seen on the intermediate phenotypes, with the appearance and disappearance of many
peaks. This is a consequence of the finite size of the population (N = 3000), which enables the
scarcely populated intermediate phenotypes to be genotypically homogeneous; if we also consider
that each individual can only mate with other individuals with the same phenotype, it can be seen
that there is no dispersion of offspring, so that the peaks of intermediate phenotypes will be hard
to eradicate by random fluctuations. In the long run, however, only two peaks at the opposite
ends of the phenotypic space will survive, because their frequencies were already high at T = τ.
Finally one of the two peaks goes extinct and the stationary distribution is represented by a delta
peak in one of the extreme regions of the phenotypic space. Figure 8 shows the plots of variance
and mean in a run with Δ = 0: it can be seen that the oscillations of both variance and mean for
T > τ are wider and more irregular than in the case of Figure 7, where Δ = 1.
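The loss of one of two coexisting peaks by random sampling alone can be illustrated with a
deliberately simplified neutral model. The sketch below is not the model of this paper: it
ignores recombination and mutation, treats the two peaks as fully isolated groups with equal
fitness, and uses a hypothetical population of 200 individuals (instead of N = 3000) to keep the
run short.

```python
import random

def generations_until_one_peak_is_lost(n_first, population, rng,
                                       max_generations=1_000_000):
    """Neutral resampling of two isolated peaks at fixed population size.

    Each generation every individual picks its parent at random from the
    previous generation (equal fitness, no mutation, no mating between peaks).
    Returns (generation, final count of the first peak).
    """
    count = n_first
    for generation in range(max_generations):
        if count == 0 or count == population:
            return generation, count
        frequency = count / population
        count = sum(rng.random() < frequency for _ in range(population))
    return max_generations, count

rng = random.Random(1)  # fixed seed so that the run is reproducible
generation, survivors = generations_until_one_peak_is_lost(100, 200, rng)
print(generation, survivors)
```

In this sketch nothing opposes the fluctuations, so sooner or later one of the two groups is
lost and the other takes over the whole population.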
Figure 8: Variance and mean in a regime of maximal assortativity (Δ = 0) and absence of
competition (J = 0). Annealing parameters: μ0 … Total evolution time: 30000 generations.
A.1.2 Effects of competition.

The simulations just discussed show that assortativity alone is sufficient to induce speciation. The
two newborn species, however, are not stable and soon one of them disappears due to random
fluctuations. The following experiments show that competition may stabilize multispecies
coexistence. In this respect, it must be remembered that competition is inversely correlated with the
phenotypic distance, and therefore the competition among individuals with the same phenotype
is maximal. Besides, competition is also proportional to the population density. As a result, if
the number of individuals of a species increases (owing to random sampling, for instance), the
intraspecific competition increases as well, leading to a decrease in fitness which, in turn, determines
a reduction of the population size at the following generation. In conclusion, competition acts as
a stabilizing force preventing the population from extinction.
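The two properties recalled above (competition decreasing with phenotypic distance and
proportional to density) can be sketched as follows. The kernel used here,
exp(-|(x - y)/R|^α / α), is an assumed form chosen for illustration, and the peak weights are
hypothetical; J, α and R play the roles of competition intensity, kernel exponent and
competition range.

```python
import math

def competition_felt(x, freqs, J, alpha, R):
    """Competition experienced by phenotype x from the whole population.

    Assumed kernel: exp(-|(x - y) / R|**alpha / alpha), weighted by the
    frequency of each phenotype y and scaled by the intensity J.
    """
    return J * sum(f * math.exp(-abs((x - y) / R) ** alpha / alpha)
                   for y, f in enumerate(freqs))

L = 14

def delta_peaks(weights):
    """Frequency vector over phenotypes 0..L built from {phenotype: weight}."""
    freqs = [0.0] * (L + 1)
    for x, w in weights.items():
        freqs[x] = w
    return freqs

balanced = delta_peaks({0: 0.5, 14: 0.5})
unbalanced = delta_peaks({0: 0.7, 14: 0.3})

# Competition on the peak at x = 0 grows when that peak becomes more crowded.
felt_balanced = competition_felt(0, balanced, J=1, alpha=2, R=2)
felt_unbalanced = competition_felt(0, unbalanced, J=1, alpha=2, R=2)
print(felt_balanced, felt_unbalanced)
```

Because the two peaks are much farther apart than R, each one feels almost only its own density,
so a peak that grows by chance is penalized more than the one that shrinks.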

A.1.2.1 Weak competition. We begin our discussion with the case of weak competition (J =
1, α = 2, R = 2) and strong assortativity (Δ = 1). At the beginning of the simulated annealing
run, the distribution extends over the whole phenotypic space with low frequencies on the
intermediate phenotypes and high frequencies on the extreme ones, because the latter prevalently
produce offspring similar to the parents while the former spread their offspring over several
different phenotypes. When T = τ the mutation rate significantly decreases and two different
scenarios may arise:

1. Stable coexistence of three species represented by delta peaks located in the phenotypic space
so as to maximize the reciprocal distance and thus minimize competition ^.

2. Stable coexistence of two species located close to each other in the phenotypic space (5-6
phenotype units apart). This small distance is possible because competition is rather weak
(J = 1, α = 2, R = 2) ^.

Figure 9 shows the final distribution (generation 40000) obtained in a run with p = 0.

Figure 9: The final distribution (generation 40000) obtained with weak competition (J = 1, α = 2,
R = 2) and strong assortativity (Δ = 1).



^The first one is typically at x = 0, x = 1 or x = 2, the second one at x = 6, x = 7 or x = 8 and the third
one at x = 11, x = 12 or x = 13.

^The first peak is typically at x = 3 or x = 4, while the second one is at x = 9, x = 10 or x = 11.
The general idea is that competition stabilizes the coexistence and prevents extinction due to
random fluctuations. Mutation plays almost no role in the presence of competition, except in
offering individuals the opportunity of populating empty "niches". The distribution reported in
Figure 9 is stable because the peaks are separated by a distance greater than the competition range.
If for some reason the central peak disappears, and if one waits long enough, then mutations may
repopulate this position (which is free from competition), or the other peaks may shift and occupy
intermediate positions, which is the other stable configuration. However, extinction of the central
peak would be preceded by a decrease in its population, with a consequent decrease of its intraspecific
competition, and by an increase in the population of the other peaks, since we are working at fixed
population size. This in turn would increase the intraspecific competition of the two external peaks,
thus lowering their fitness, and this feedback brings the system back to the initial configuration.
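The restoring feedback can be illustrated with a deterministic toy iteration. This sketch is a
simplification and not the model of this paper: it considers only two well-separated peaks at
fixed total population and assumes a fitness of the form exp(-J × own frequency), so that only
intraspecific competition acts; the starting frequency 0.7 is a hypothetical fluctuation.

```python
import math

def next_frequency(p, J):
    """One selection step for two well-separated peaks with frequencies p and 1 - p.

    Assumed fitness: exp(-J * own frequency), i.e. only intraspecific
    competition acts because the peaks are farther apart than the range R.
    """
    w_first = math.exp(-J * p)
    w_second = math.exp(-J * (1.0 - p))
    return p * w_first / (p * w_first + (1.0 - p) * w_second)

p = 0.7  # hypothetical fluctuation that enlarged the first peak
trajectory = [p]
for _ in range(50):
    p = next_frequency(p, J=1.0)
    trajectory.append(p)

print(trajectory[:5], trajectory[-1])
```

The enlarged peak has the lower fitness, so its frequency decreases step by step and the
iteration relaxes back to equal frequencies.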

We now discuss the plots of average fitness, variance and mean. Since p = 0, the initial
distribution is a delta peak in x = 0 that is very quickly deformed into a distribution spanning the
whole phenotypic space; this leads to an abrupt increase of the mean and variance of
the distribution, but the average fitness also increases significantly because spreading the
population over several phenotypes relieves competition. This effect is stronger when
p = 0 and p = 1, which is why we discuss the p = 0 case. As shown in Figure 10, the mean
fitness oscillates at a very high level until T = τ. After that, due to the decrease of the mutation
rate, the individuals with intermediate phenotypes are removed by random fluctuations and the
whole population concentrates on two delta peaks near the ends of the space. As a consequence
of the very high intraspecific competition in the two peaks, the average fitness
drops abruptly. In this configuration the central part of the phenotypic space is not populated at
all, so that it is a particularly good location for a new colony to grow (very low competition).
This is why two patterns may arise: the two peaks at the two ends of the phenotypic space may
move closer to each other, or a third peak may appear in the middle of the phenotypic space. In
the simulation we are describing these events are very fast and they do not exclude each other:
the two peaks may move a little closer and then a third peak appears. The mean fitness, however,
does not change significantly: if a third peak appears, there is a decrease in the intraspecific
competition that is partly balanced by an increase in the interspecific competition. The variance,
on the other hand, decreases because the frequency of the central peak (which contributes very
little to the variance, being near the mean) is higher than the sum of the old frequencies of
the intermediate phenotypes. Finally, the mean of the distribution jumps abruptly from x = 0 to
x = 7 at the beginning of the run but then remains constant because all the following deformations
of the distribution are roughly symmetrical.
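The variance argument can be checked with a little arithmetic. The sketch below computes the mean and variance of a phenotype distribution; the two distributions are hypothetical head counts chosen only for illustration (they are not simulation output), with L = 14 so that phenotypes run from 0 to 14.

```python
def mean_and_variance(counts):
    """Mean and variance of a phenotype distribution given as {phenotype: head count}."""
    total = sum(counts.values())
    mean = sum(x * n for x, n in counts.items()) / total
    var = sum(n * (x - mean) ** 2 for x, n in counts.items()) / total
    return mean, var

# Hypothetical populations of 3000 individuals (illustrative numbers only).
two_peaks = {1: 1500, 13: 1500}             # two peaks near the ends of the space
three_peaks = {1: 1000, 7: 1000, 13: 1000}  # a third peak has appeared in the middle

mean_2, var_2 = mean_and_variance(two_peaks)
mean_3, var_3 = mean_and_variance(three_peaks)
print(mean_2, var_2)
print(mean_3, var_3)
```

In both cases the mean stays at x = 7, while the variance is lower for the three-peak distribution, because the individuals moved to the central peak sit on the mean and contribute nothing to the variance.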

A.1.2.2 Deterioration of the Environment and Fisher's theorem. As the decrease in
mean fitness contradicts the traditional interpretation of Fisher's theorem, we further investigate
this problem by studying the role of assortativity. If assortativity is maximal (A = 0) the mean
fitness remains approximately constant throughout the simulation. After the extinction of the
intermediate phenotypes following T = τ, we get a final distribution with four evenly spaced delta
peaks. This is possible because A = 0 forces individuals to mate only with partners showing the
same phenotype. As we have observed earlier, genetic drift in finite populations tends to make
them genetically homogeneous, thus preventing dispersion of the offspring. This is no longer true when
A = 1: no more than three delta peaks can appear, so that the intraspecific competition is high
and the mean fitness drops abruptly. The decrease in fitness is even more significant when A = 2:
the mating range is large enough to prevent the splitting of the initial distribution and we finally
get a single peak spanning two phenotypes so that intraspecific competition is very high and fitness
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Figure 10: Average fitness, variance and mean in a regime of strong assortativity (A = 1) and weak 
competition {J = 1, a = 2, R = 2). Annealing parameters: /xq = 10^^, /ioo = 10~^, r = 10000, 
6 = 3000. Total evolution time: 40000 generations. 



is minimal. The situation is slightly different for A = 3: even if the final distribution is the same
as for A = 2, this condition is reached about 15000 generations later because the larger mating
range tends to keep a wide bell-shaped distribution. This is exactly what happens when A = 4,
when the large mating range stabilizes a wide bell-shaped distribution covering five phenotypes.
The pattern becomes even more evident for A > 7, when the final bell-shaped distribution spans
seven phenotypes. The plots clearly show that as the bell-shaped distribution becomes wider and
wider, the decrease in fitness becomes less and less significant as a consequence of the reduced
intraspecific competition. Figure 11 shows the plots of mean fitness we have just discussed.
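The link between genetic homogeneity and the absence of offspring dispersion can be illustrated with a minimal sketch. It assumes, consistently with the text but not confirmed by this section, that the phenotype is the number of 1-alleles in the genome and that recombination copies each locus from either parent with equal probability; the function names and the four-locus genomes are hypothetical.

```python
import random

def phenotype(genome):
    """Phenotype taken as the number of 1-alleles (assumption)."""
    return sum(genome)

def can_mate(x, y, mating_range):
    """Two individuals may mate only if their phenotypes differ by at most A."""
    return abs(x - y) <= mating_range

def offspring(parent_a, parent_b, rng):
    """Free recombination (assumption): each locus is copied from either parent."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]

rng = random.Random(0)
a = [1, 1, 0, 0]  # phenotype 2
b = [1, 1, 0, 0]  # phenotype 2, same genome as a
c = [0, 0, 1, 1]  # phenotype 2, different genome

# With A = 0 all three individuals are allowed to mate with each other.
assert can_mate(phenotype(a), phenotype(c), mating_range=0)

homogeneous = {phenotype(offspring(a, b, rng)) for _ in range(200)}
heterogeneous = {phenotype(offspring(a, c, rng)) for _ in range(200)}
print(sorted(homogeneous))    # offspring phenotypes from identical genomes
print(sorted(heterogeneous))  # offspring phenotypes from different genomes
```

Identical parental genomes always reproduce the parental phenotype, whereas parents with the same phenotype but different genomes scatter their offspring over several phenotypes. This is why drift-induced homogeneity matters for keeping the delta peaks sharp when A = 0.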

A.1.2.3 Role of competition range. In the next simulation we discuss the role of the
competition range R. In particular, we consider the case J = 1, a = 2, R = 6 with strong assortativity
A = 1 and a flat static fitness landscape (β = 100, Γ = 14). We have seen that when the
competition range is small (for instance R = 2 as in the experiment shown in Figures 9 and 10) the
stationary state is characterized by the coexistence of two or three species close to each other in
the phenotypic space. On the contrary, when R = 4 only two species can coexist and they are far
from each other, being located at the opposite ends of the phenotypic space^. In Figure 12 we show
the final distribution of a run with p = 0.

In this simulation the average fitness first increases abruptly when the initial delta peak in
x = 0 becomes a wide distribution covering the whole phenotypic space; then it oscillates around a
constant value until T = τ. Finally it increases again: when the whole population crowds
into the two delta peaks at the opposite ends of the phenotypic space, the increase in intraspecific
competition is outweighed by the fact that the distribution moves away from the central region
of the space, where the fitness is very low due to the high value of the competition range R. The
variance also increases at the beginning of the run when the delta peak turns into a wide distribution,
and it increases again when this distribution splits into two delta peaks far from the mean. Finally,
the mean increases abruptly from x = 0 to x = 7 and then oscillates around this value. Figure 13
shows the plots of mean fitness, variance and mean in a typical run.

^Typically, the first peak is found at x = 0, and less often at x = 1, while the second peak is usually at x = 14
and less often at x = 13.






Figure 11: Average fitness plots for several values of assortativity A = 0,1,2,3,4,7. Regime of 
weak competition: J = 1, a = 2, R = 2. Annealing parameters: /iq = 10^^, /ioo = 10^^, r = 10000, 
6 = 3000. Total evolution time: 50000 generations. 

A.1.2.4 High competition. We now consider the case of high competition intensity and small
competition range. In particular, we discuss a simulation with J = 8, a = 2, R = 2; for the
sake of comparison with the previous simulations we still keep A = 1. When the competition
was weak (see for instance Figure 9) no more than three species could coexist in the phenotypic
space. When J = 8, however, the competition pressure is so strong that the population cannot be
distributed over only three species, and a fourth species appears so as to relieve competition. The
simulations show that the first species is always in x = 0 and the fourth is always in x = 14. This is
no surprise because these positions enjoy the lowest competition level; in fact the x = 0 phenotype
has no competitors for x < 0 and x = 14 has no competitors for x > 14. The second species
sometimes appears in x = 5 but more often it is represented by a peak covering phenotypes x = 4
and x = 5. In a similar fashion the third species sometimes appears in x = 9, but typically it
is represented by a peak spanning phenotypes x = 9 and x = 10. The fact that a species
comprises more than one phenotype is, again, an evolutionary solution to relieve a competition
that would be unbearable if the species were concentrated on a single phenotype. Figure 14 shows
the final distribution of a run with p = 0.

If we now analyze the mean fitness plot, we notice that H(x) is always negative whereas
in Figure 10 it was always positive: the high intensity of competition thus represents a measure
of the deterioration of the environmental conditions (to use the language of the Price and Ewens
reformulation of Fisher's theorem) that keeps the fitness at a very low level. The mean fitness
increases rapidly at the beginning of the run because spreading the population over
several phenotypes relieves the competition. The mean fitness then oscillates around a constant
value and grows again when the mutation rate decreases and the frequency distribution splits
into four peaks. In this case, contrary to what is shown in Figure 10, the decrease in intraspecific
competition outweighs the increase in interspecific competition so that the fitness can increase.
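The change of sign of the mean fitness between weak and strong competition can be reproduced with a small numerical sketch. The functional form used here is an assumption, since the fitness definition is not restated in this section: fitness is taken as a static term minus J times a sum over phenotypes of a kernel exp(-|r|^a / a) evaluated at r = (x - y)/R and weighted by the frequency p(y), and the flat static fitness is approximated by the constant 1. The peak positions and equal frequencies are idealized values loosely based on the configurations described in the text.

```python
import math

def kernel(r, a):
    """Assumed competition kernel: maximal for identical phenotypes, decaying with distance."""
    return math.exp(-abs(r) ** a / a)

def fitness(x, dist, J, a, R, static=1.0):
    """Assumed fitness: static term minus competition from every populated phenotype."""
    competition = sum(kernel((x - y) / R, a) * p for y, p in dist.items())
    return static - J * competition

def mean_fitness(dist, J, a, R):
    """Population average of the fitness over the distribution {phenotype: frequency}."""
    return sum(p * fitness(x, dist, J, a, R) for x, p in dist.items())

# Idealized final distributions (equal frequencies are an illustrative simplification).
weak = {1: 1 / 3, 7: 1 / 3, 13: 1 / 3}         # three species, J = 1
strong = {0: 0.25, 5: 0.25, 9: 0.25, 14: 0.25}  # four species, J = 8

h_weak = mean_fitness(weak, J=1, a=2, R=2)
h_strong = mean_fitness(strong, J=8, a=2, R=2)
print(h_weak, h_strong)
```

Under these assumptions the mean fitness is positive for J = 1 and negative for J = 8, even though the population is spread over more species in the second case, in line with the qualitative behavior described for Figures 10 and 15.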

Finally, it is important to compare the result of the simulation with J = 8, a = 2, R = 2 with
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Figure 12: The final distribution (generation 40000) obtained with weak competition intensity but 
high competition range (J = 1, a = 2, R = 6) and strong assortativity (A = 1).
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Figure 13: Average fitness, variance and mean in a regime of strong assortativity (A = 1) and weak 
competition intensity but long competition range (J = 1, a = 2, R = 6). Annealing parameters:
10-\ fioo = 10'^ r = 10000, 5 = 3000. Total evolution time: 40000 generations.

that of the experiment with J = 1, a = 2, R = 6 portrayed in Figures 12 and 13. In both cases the
competition is very strong, but when the competition strength is due to the radius of competition,
no more than two species can coexist and they are located at the opposite ends of the phenotypic
space (at least when R = 6); conversely, if competition strength is due to the intensity parameter
J, then four species coexist (in the case J = 8) so as to minimize intraspecific competition.
Figure 15 shows the plots of mean fitness, variance and mean in the run with J = 8.

A.1.2.5 Interplay between mating range and competition. So far we have shown that a
high assortativity level alone (A = 1, A = 0) is sufficient to induce speciation, but the two newborn
species cannot coexist for very long as one of them will be eradicated by random fluctuations. We
then showed that the stable coexistence of the new species can be ensured by a high competition
level, and in particular we showed the different effects of an increase in competition intensity
and an increase in competition range. We now explore what happens if the mating range A is
very large. In Section A.1.1.2 we showed that in the absence of competition speciation is not
possible, not even in a transient way, if we set a large mating range A = 4. In the next simulation
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Figure 14: The final distribution (generation 40000) obtained with strong competition intensity 
but short competition range (J = 8, a = 2, R = 2) and strong assortativity (A = 1).
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Figure 15: Average fitness, variance and mean in a regime of strong assortativity (A = 1) and high 
competition intensity but short competition range (J = 8, a = 2, R = 2). Annealing parameters:
/io = 10-^ /ioo = 10-^ T = 10000, 6 = 3000. Total evolution time: 40000 generations. 



we show that speciation is indeed possible also in the case A = 4 if we set a sufficiently strong
competition: J = 2, a = 10, R = 4. For the sake of comparison, recall that with A = 1 a much
weaker competition was sufficient: J = 1, a = 2, R = 2 (see Figures 9 and 10). The results
of the simulations show that the final distribution is composed of two delta peaks at the opposite
ends of the phenotypic space^.

The plots of mean fitness, variance and mean can be easily interpreted. Due to the high
competition level the initial distribution (in our case a delta peak in x = 0) splits within a few
hundred generations into two bell-shaped distributions, the first one covering phenotypes x = 0
to x = 6 and the second one spanning the phenotypes x = 8 to x = 14. The first distribution
has its highest frequencies on phenotypes 4, while the second one has
its maximum on x = 10, x = 11, x = 12. The width of the two distributions is due to
the high mutation rate and it does not change until T = τ. After that, when the mutation rate
decreases, the two distributions become narrower. The first phenotypes whose frequency decreases

^The first peak is typically placed at x = 1 or x = 2 or x = 3 or x = 4 and the second one at x = 11, x = 12,
x = 14, and less frequently also at x = 9 and x = 14.

in both distributions are those placed towards the center of the phenotypic space because of the
very high competition level. Later on, because of recombination, there is a decrease in frequency
also for the phenotypes of both distributions placed towards the ends of the phenotypic space.
This evolutionary dynamics is reflected in the plot of mean fitness. The mean fitness increases
abruptly at the beginning of the run because the formation of two bell-shaped distributions relieves
the competition pressure. When the two distributions narrow down and become delta peaks,
the average fitness further increases only to a small extent because the decrease in interspecific
competition is almost completely compensated for by the increase in intraspecific competition.
Finally, it should be noticed that the high level of competition is also reflected by the fact that
the average fitness H is always negative, while in Figure 10 it was always positive. The plot of
variance is also very interesting. After an abrupt increase at the beginning of the run, the variance
oscillates around a constant value until T = τ. After that the variance first increases and then
decreases again. This is a consequence of the fact that the two distributions do not narrow down in
a symmetric way: they first lower their frequencies in the regions pointing towards the center
of the phenotypic space and only later reduce their frequencies at the opposite end of the
distribution. Figure 16 shows the plots of mean fitness, variance and mean that we have just
commented on.
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Figure 16: Average fitness, variance and mean in a regime of weak assortativity (A = 4), high 
competition intensity and long competition range (J = 2, a = 10, R = 4). Annealing parameters:
yUo = 10-\ /ioo = 10~^ r = 10000, 5 = 3000. Total evolution time: 30000 generations. 



If we further increase the mating range to A = 6, the competition level necessary to induce
speciation becomes even higher: J = 10, a = 10, R = 6. The evolutionary dynamics is the
same as described in the case A = 4 and the final distribution is made up of two delta peaks
located at the extremes of the phenotypic space^. The plot of mean fitness shows values that are
even more negative than in the simulation with A = 4. Apart from this, there are no significant
differences: the mean fitness increases abruptly at the beginning of the run and then undergoes a
second, very modest increase after T = τ. The variance, conversely, after T = τ first shows a very
small increase and then undergoes a more substantial decrease. This happens because, due to the
very high competition pressure, the two bell-shaped distributions that appear at the beginning of
the simulation become asymmetrical long before T = τ, showing a decrease in frequency of the
phenotypes pointing towards the center of the phenotypic space. As a consequence, after T = τ

^One in x = 1, x = 2, x = 3 or x = 4 and the second one in x = 11, x = 12, x = 13.

there will be basically only the decrease in frequency of the phenotypes pointing towards the ends
of the phenotypic space, hence the decrease in variance of the overall distribution. These patterns
are shown in Figure 17.







Figure 17: Average fitness, variance and mean in a regime of very weak assortativity (A = 6), high 
competition intensity and long competition range (J = 10, a = 10, R = 6). Annealing parameters:
/io = 10~^, fioo = 10~^, r = 10000, 6 = 3000. Total evolution time: 30000 generations. 

In conclusion, our simulations in a flat static fitness landscape show that competition acts
in two ways:

1. Competition stabilizes incipient species generated thanks to a high assortativity regime. 

2. Competition induces speciation events that would be otherwise forbidden when the mating 
range is very large. 

Our simulations also show that competition and assortativity act in a synergistic way as very 
high levels of competition allow speciation even when assortativity is low; conversely a regime of 
very high assortativity makes speciation possible even in the face of a weak competition. 

Finally, we have provided evidence that two dimensions of competition do exist: the competition 
intensity that induces the formation of a large number of species located at roughly equal distance 
in the phenotypic space so as to minimize competition pressure, and the competition range that 
induces the formation of a small number of species placed at maximal distance from each other. 

A.2 Steep static fitness landscape.
A.2.1 Absence of competition.

In our model we assume that the 0-allele represents the wild-type while the 1-allele is the least 
deleterious mutant. This entails that the larger the number of 1-alleles in a genotype, the lower the 
fitness. This is why a particularly interesting case to investigate is that of a monotonic decreasing 
static fitness. 
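The difference between a flat and a steep, monotonically decreasing static fitness can be sketched numerically. The functional form below, exp(-(x/Γ)^β / β), is an assumption (the definition is not restated in this section); it is used only because it behaves as described for the parameter values quoted in the text, namely β = 100, Γ = 14 for the flat landscape and β = 1, Γ = 10 for a steep one.

```python
import math

def static_fitness(x, beta, gamma):
    """Assumed static fitness: equal to 1 at x = 0 and non-increasing in x."""
    return math.exp(-((x / gamma) ** beta) / beta)

L = 14  # genome length, so phenotypes run from 0 to L

flat = [static_fitness(x, beta=100, gamma=14) for x in range(L + 1)]
steep = [static_fitness(x, beta=1, gamma=10) for x in range(L + 1)]

print(min(flat), max(flat))   # nearly constant over the whole phenotypic space
print(steep[0], steep[-1])    # decays steadily away from the master sequence x = 0
```

With these values the flat landscape stays within about one percent of its maximum over the whole phenotypic space, while the steep one decreases at every step away from x = 0.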

In the absence of competition the asymptotic distribution is a sharp peak on the master
sequence x = 0. In this case, the whole population crowds on the phenotype with the highest fitness
level. The process is rather straightforward: at the beginning of the simulated annealing the initial
frequency distribution is rapidly transformed into a distribution with a high peak in x = 0 and a
long tail of mutants at very low frequencies spanning the entire phenotypic space. For T > τ
the mutation rate decreases and the tail of mutants disappears so that the stationary distribution
is represented by a delta peak in x = 0. In Figure 18 we show the stationary distribution in a run
with β = 1, Γ = 150, J = 0, A = 0.
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Figure 18: The final distribution (generation 30000) obtained in absence of competition (J = 0) 
and with maximal assortativity (A = 0) in the case of steep static fitness landscape {jS = 1, 
r = 150) . Annealing parameters: /iq = ilO)'\ fi^ = (10)-^ r = 10000, 5 = 3000. Total 
evolution time 30000 

The behavior of the mean, variance and mean fitness of the distribution is straightforward and
completely dominated by the mutation rate and the slope of the fitness landscape. As an example,
when p = 0 the initial distribution is a delta peak in x = 0 so that the mean fitness equals one
and the variance and mean equal zero. Due to the high mutation rate at the beginning of the
simulation, however, a long tail of mutants appears in regions with low static fitness so that the
mean fitness drops abruptly while the variance and mean of the distribution increase. Finally,
after T = τ the mutation rate decreases, the tail of mutants disappears and the distribution
turns again into a delta peak in x = 0. As a consequence, the mean and variance of the distribution
vanish again while the mean fitness returns to unity.
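The role of the annealing time τ can be made concrete with a sketch of a mutation-rate schedule. The hyperbolic-tangent form below is an assumption, chosen as a smooth switch from an initial rate μ_0 to a final rate μ_∞ centered at τ with width δ; the values of τ and δ follow the figure captions, whereas the two rates are placeholders and not the values used in the simulations.

```python
import math

def mutation_rate(t, mu0, mu_inf, tau, delta):
    """Assumed schedule: close to mu0 well before tau, close to mu_inf well after it."""
    return mu_inf + 0.5 * (mu0 - mu_inf) * (1.0 - math.tanh((t - tau) / delta))

MU0, MU_INF = 1e-1, 1e-6   # placeholder rates, for illustration only
TAU, DELTA = 10000, 3000   # annealing parameters quoted in the captions

schedule = [mutation_rate(t, MU0, MU_INF, TAU, DELTA) for t in range(0, 30001, 5000)]
for t, mu in zip(range(0, 30001, 5000), schedule):
    print(t, mu)
```

Under this schedule the mutation rate is halfway between the two limits at T = τ and falls steadily afterwards, which is when the tail of mutants is expected to disappear and the distribution collapses back onto the peak.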

A.2.2 Stabilizing effects of competition.

Once again, competition plays a key role as a powerful stabilizing force. When mating is random, 
the only effect of an extremely high competition is to induce the formation of a wide and flat 
frequency distribution spanning over all the phenotypic space, but no distinct species will appear. 

In conditions of high assortativity (A = 3), a moderate level of competition (J = 2) is sufficient
to induce the appearance of two asymmetric sharp peaks. The tallest peak appears on the master
sequence because this is the position of the phenotypic space with the highest static fitness; the
other peak, at the opposite end of the phenotypic space, is smaller because it is near the static
fitness minimum, but it does not go extinct because, being very far from the first peak, it enjoys the
lowest possible interspecific competition.

As an example, we describe the dynamic behavior in a simulation with the following parameter
setting: β = 1, Γ = 10, J = 2, a = 2, R = 4, A = 3. It is necessary to distinguish the cases p = 0.5
and p = 0. If p = 0.5 the initial distribution quickly splits into two bell-shaped distributions, one
near x = 0 and one in the neighborhood of x = 14. The distribution near x = 0 is asymmetrical,
having higher frequencies on the phenotypes with higher static fitness; conversely, the distribution
near x = 14 is flat and wide because of the low values of the static fitness in that region. The
appearance of the two distributions at the opposite ends of the phenotypic space is related to a
steep increase in mean fitness because spreading the population over several phenotypes relieves
competition. For T > τ the mutation rate decreases and the two bell-shaped distributions narrow
down as a result of recombination; the second one moves closer to the first one following the
static fitness gradient, so that both the variance and the mean decrease while the mean fitness
increases to a small extent. On the contrary, if p = 0, only an asymmetrical bell-shaped distribution
near x = 0 appears at the beginning of the simulation. A second bell-shaped distribution appears
near x = L only much later during the simulation, when a colony of mutants large enough to resist
random fluctuations has been created in that region of the phenotypic space. As a consequence,
when p = 0 the abrupt increase in mean fitness is delayed as compared to the case p > 0, and
the same applies to the variance and the mean. For T > τ, however, the pattern is the same as for
p > 0 and in the final distribution we always find a tall delta peak in x = 0 and a second smaller
peak near the opposite end of the phenotypic space^. Figure 19 shows the plots of mean fitness,
variance and mean in the cases p = 0 and p = 0.5.

As mentioned above, when competition is weak (J = 2) the final distribution is formed by two
asymmetric delta peaks. The peak in x = 0 is populated by 65% of the population while the peak
in x = 10 accounts for the remaining 35% of the organisms. When the competition intensity is
increased to J = 4, the final distribution still shows two peaks, but these are almost symmetrical:
the delta peak in x = 0 is populated by about 55% of the individuals, while the second peak, which
is usually a delta peak in x = 11 but sometimes also a peak spanning phenotypes x = 10
and x = 11 or phenotypes x = 11 and x = 12, comprises the remaining 45%. Figure 20 shows the
typical final distribution attained in these simulations.

A.2.3 Competition-induced speciation.

If competition is increased up to J = 7, the population splits into two species with approximately
the same frequency but whose distributions show very different geometries. While the master
sequence species still exhibits a sharp peak distribution populated by about 55% of the population,
the species near x = L shows a wide and flat bell-shaped frequency distribution so as to minimize
the intraspecific competition^. The final distribution obtained in a simulation with J = 7 and all
the other parameters unchanged as compared to the previous example is shown in Figure 21.

We now investigate what happens if J is increased to 10 while the other parameters are
left unchanged. At the beginning of the simulation a frequency distribution appears that spans
the whole phenotypic space and shows maximal frequencies near x = 0 and x = 14; there is also
a hump in the central region of the phenotypic space: these are the regions where competition is
minimal. Moreover, the phenotypes close to x = 0 and x = 14 enjoy a very small dispersion of
offspring. The creation of such a distribution implies a first significant increase in mean fitness,
variance and mean. Immediately after T = τ the mutation rate decreases and three delta peaks
appear in x = 0, x = 7 and x = 14. This leads to a new, significant increase in mean fitness
because, contrary to what happens in the cases J = 4 and J = 7, the effect of the increase in
frequency of phenotype x = 0, which enjoys the maximal static fitness, is not canceled out by the
increase in intraspecific competition, this event being prevented by the formation of a delta peak

^Usually in position x = 10 but less often also in x = 9 and x = 11.

^This second distribution usually extends over phenotypes x = 11, x = 12 and x = 13 but sometimes also over
phenotypes x = 10 and/or x = 14.
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Figure 19: plots of mean fitness, variance and mean for p = and p = 0.5 obtained with moderate 
competition ( J = 2) and assortativity (A = 3) in the case of steep static fitness landscape (/3 = 1, 
r = 10) . Annealing parameters: /xq = {10)~\ /ioo = (10)-^ r = 20000, 6 = 1000. Total evolution 
time 30000 
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Figure 20: The final distribution (generation 40000) obtained with competition intensity J = 2 
(left panel) and J = 4 (right panel) for a moderate assortativity A = 3 in the case of steep static 



fitness landscape (β = 1, Γ = 10). Annealing parameters: /xq
5 = 1000. Total evolution time 40000

in intermediate position. Immediately after T = τ there is also a significant increase in variance
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Figure 21: The final distribution (generation 40000) obtained with competition intensity J = 7 
and assortativity A = 3 in the case of steep static fitness landscape (/5 = 1, F = 10) . Annealing 
parameters: /xq = (10)"\ yUoo = (10)~^ r = 20000, 5 = 1000. Total evolution time 40000 



because there is a significant increase in frequency of phenotypes x = 0 and x = 14, which, being
very far from the mean, provide a large contribution to the variance, while at the same time there
is a decrease in frequency of the phenotypes in intermediate position close to the mean. As usual,
in the case p = 0.5 the mean fitness and variance of the initial population are very large, so that the
initial increase of these parameters is more limited. In Figure 22 we show a typical example
of final distribution when J = 10.
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Figure 22: The final distribution (generation 40000) obtained with competition intensity J = 10 
and assortativity A = 3 in the case of steep static fitness landscape (/? = 1, F = 10) . Annealing 
parameters: /xq = (10)"\ /ioo = (10)-^ r = 20000, 6 = 1000. Total evolution time 40000 

In Figure 23 we also show the plots of mean fitness, variance and mean in the cases p = 0 and
p = 0.5 when J = 10.

A.2.4 Interplay between competition and assortativity in a steep fitness landscape.

Figure 22 shows that a stationary distribution with three delta peaks can be obtained in the
presence of moderate assortativity (A = 3) if competition intensity is sufficiently high (J = 10).
We now show that this pattern can be obtained also with a much weaker competition (J = 4) if
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Figure 23: Plots of mean fitness, variance and mean for p = and p = 0.5 obtained with competi- 
tion intensity J = 10 and assortativity A = 3 in the case of steep static fitness landscape (/3 = 1, 
r = 10) . Annealing parameters: /xq = (10)-\ /ioo = (10)-^ r = 20000, 5 = 1000. Total evolution 
time 40000 



assortativity is maximal (A = 0). Maximal assortativity in fact induces genotypic homogeneity
in the strains of a population, thus preventing dispersion of the offspring. This is a very important
condition for a peak in the middle of the phenotypic space to resist eradication, as it experiences
strong competition from both extreme strains. The symmetry of the stationary distribution
depends on the steepness of the static fitness. When the static fitness is not so steep (Γ = 50)
the three peaks are almost equal: x = 0 (40%), x = 7 (30%), x = 14 (30%); conversely, when
the fitness profile is very steep (Γ = 2) the peaks are very asymmetrical: x = 0 (50%), x = 8
(20%), x = 14 (30%). The stationary distributions obtained with Γ = 2 and Γ = 50 are shown in
Figure 24.

The plots of mean fitness, variance and mean only show quantitative differences from those
shown in Figure 22 and therefore they are not presented.

A.3 Influence of genome length.

The simulations with both flat and steep static fitness landscapes discussed so far showed that the
mating range A has a strong influence on the evolutionary dynamics and on the stationary
distribution. Small values of A in fact make speciation possible in the presence of weak competition
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Figure 24: The final distribution (generation 30000) obtained with competition intensity J = 4 
and maximal assortativity A = in the case of an extremely steep static fitness landscape (/5 = 1, 
r = 2) (left panel) and a moderately steep one (/5 = 1, F = 50) (right panel) . Anneahng 
parameters: /io = {10)~\ f^oc = (10)-^ r = 20000, 5 = 1000. Total evolution time 30000 



and sometimes even in the absence of competition. One may argue, however, that it makes sense to
talk about a small or large mating range only with reference to the genome length L because this,
in turn, determines the number of possible phenotypes and hence the size of the phenotypic space.
In all the simulations discussed so far the genome length was set to L = 14. We now perform
simulations with L = 28 to show that the final distribution is not affected by the genome length.

We begin with simulations with a flat static fitness landscape. These simulations have to be
compared with those shown in Figures 9 and 10. When the genome length is doubled from L = 14
to L = 28, the competition range R and the mating range A must also be doubled, from 2 to 4
and from 1 to 2 respectively. We also doubled the steepness parameter Γ of the static fitness from
14 to 28 so as to achieve a flat landscape.

The evolutionary pattern for L = 28 is basically the same as with L = 14. The initial
distribution quickly extends over the whole phenotypic space with higher frequencies in the regions near
x = 0 and x = 28. The phenotypes in the range x = 1 to x = 5 and x = 23 to x = 27 correspond
roughly to 5% of the population each, while the frequencies of phenotypes from x = 6 to x = 22
range from 2.5% to 3%. It therefore appears that the positions of the peaks when L = 28 are
roughly the same as the positions observed with L = 14 multiplied by 2. The creation of this wide
distribution corresponds to a significant increase in mean fitness and variance. After T = τ the
intermediate phenotypes tend to disappear and the final distribution is represented by two
or three delta peaks. When the final distribution is represented by three delta peaks, the first one
is typically located in x = 2, x = 3, x = 4 and less often also in x = 1 and x = 5; the second peak
typically lies in the range 13 to 16 and the third peak in the range 23 to 26. When only two peaks
appear in the stationary distribution, the first one is usually in the range 7 to 9 while the second
one is in the range 21 to 24. The appearance of these delta peaks after T = τ is characterized by
a significant decrease both in mean fitness and variance. The decrease in mean fitness is due to
the increase in intraspecific competition that occurs when the population, initially spread over the
whole phenotypic space, concentrates on a few phenotypes only. The decrease in variance, conversely,
is due to the extinction of the intermediate phenotypes, which provided many small contributions
to the variance that summed up to a considerable value. The intermediate phenotypes are replaced
by a single peak not far from the mean of the distribution whose contribution to the variance is
very small. The stationary distribution and the plots of mean fitness, variance and mean are
shown in Figures 25 and 26.

Figure 25: Genome length L = 28. The final distribution (generation 40000) obtained with 
competition intensity J = 1 and assortativity Δ = 2 in the case of flat fitness landscape (β = 100, 
Γ = 28). Annealing parameters: μ_0 = 10^-[illegible], μ_∞ = 10^-[illegible], τ = 10000, δ = 3000. 
Total evolution time 40000. 

Figure 26: Genome length L = 28. Plots of mean fitness, variance and mean obtained with 
competition intensity J = 1 and assortativity Δ = 2 in the case of flat fitness landscape (β = 100, 
Γ = 28). Annealing parameters: μ_0 = 10^-[illegible], μ_∞ = 10^-[illegible], τ = 10000, δ = 3000. 
Total evolution time 40000. 
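
The variance argument given for the flat landscape can be illustrated numerically. The sketch 
below is only an illustration under stated assumptions and is not simulation output: each 
intermediate phenotype from x = 6 to x = 22 is given a frequency of 2.75% (the midpoint of the 
2.5 to 3% range quoted in the text), the mean of the distribution is assumed to sit at x = 14, 
and all of that mass is hypothetically moved onto a single delta peak at x = 15. 

```python
def variance_contribution(frequencies, mean):
    """Sum of p(x) * (x - mean)**2 over the given phenotypes."""
    return sum(p * (x - mean) ** 2 for x, p in frequencies.items())


MEAN = 14  # assumed mean of the distribution (centre of the space for L = 28)

# Assumption: 17 intermediate phenotypes, each holding 2.75% of the population.
intermediate = {x: 0.0275 for x in range(6, 23)}
mass = sum(intermediate.values())

# Assumption: the same mass collapsed onto one delta peak at x = 15.
single_peak = {15: mass}

spread = variance_contribution(intermediate, MEAN)
collapsed = variance_contribution(single_peak, MEAN)
print(f"intermediate phenotypes: {spread:.2f}")
print(f"single central peak:     {collapsed:.2f}")
```

Under these assumptions the intermediate phenotypes contribute about 11.2 to the variance, while 
the single central peak carrying the same mass contributes about 0.47. This only illustrates the 
mechanism; the actual change in variance also depends on how the peaks near the ends of the 
phenotype space rearrange. 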



We now consider a simulation with a steep static fitness landscape that can be compared with 
the one shown in Figures |221 and EHl. Also in this case the doubling of L to 28 was paralleled 
by a doubling of Γ, R and Δ to 20, 8 and 6 respectively. The initial distribution splits into three 
distributions at the opposite ends and in the middle of the phenotypic space, which leads to a 
substantial increase in mean fitness and variance. After T = τ the three distributions become 
delta peaks at x = 0, x = 14 (but occasionally also at x = 15) and x = 28. When this happens 
a second increase in mean fitness and variance takes place. The stationary distribution and the 
plots of mean fitness, variance and fitness in a simulation with p = 0.5 are shown in Figures 27 
and 28. 

Figure 27: Genome length L = 28. The final distribution (generation 40000) obtained with 
competition intensity J = 10 and assortativity Δ = 6 in the case of steep fitness landscape (β = 1, 
Γ = 20). Annealing parameters: μ_0 = 10^-[illegible], μ_∞ = 10^-[illegible], τ = 20000, δ = 1000. 
Total evolution time 40000. 

Figure 28: Genome length L = 28. Plots of mean fitness, variance and mean obtained with 
competition intensity J = 10 and assortativity Δ = 6 in the case of steep fitness landscape (β = 1, 
Γ = 20). Annealing parameters: μ_0 = 10^-[illegible], μ_∞ = 10^-[illegible], τ = 20000, δ = 1000. 
Total evolution time 40000. 